Abstract. In this paper we investigate the uniform distribution properties of polynomials in many variables and bounded degree over a fixed finite field $\mathbb{F}$ of prime order. Our main result is that a polynomial $P : \mathbb{F}^n \to \mathbb{F}$ is poorly-distributed only if $P$ is determined by the values of a few polynomials of lower degree, in which case we say that $P$ has small rank.

We give several applications of this result, paying particular attention to consequences for the theory of the so-called Gowers norms. We establish an inverse result for the Gowers $U^{d+1}$-norm of functions of the form $f(x) = e_{\mathbb{F}}(P(x))$, where $P : \mathbb{F}^n \to \mathbb{F}$ is a polynomial of degree less than $|\mathbb{F}|$, showing that this norm can only be large if $f$ correlates with $e_{\mathbb{F}}(Q(x))$ for some polynomial $Q : \mathbb{F}^n \to \mathbb{F}$ of degree at most $d$.

The requirement $\deg(P) < |\mathbb{F}|$ cannot be dropped entirely. Indeed, we show the above claim fails in characteristic 2 when $d = 3$ and $\deg(P) = 4$, showing that the quartic symmetric polynomial $S_4$ in $\mathbb{F}_2^n$ has large Gowers $U^4$-norm but does not correlate strongly with any cubic polynomial. This shows that the theory of Gowers norms in low characteristic is not as simple as previously supposed. This counterexample has also been discovered independently by Lovett, Meshulam, and Samorodnitsky [15].

We conclude with sundry other applications of our main result, including a recurrence result and a certain type of nullstellensatz.



1. Introduction 



Let $\mathbb{F}$ be a finite field of prime order. Throughout this paper, $\mathbb{F}$ will be considered fixed (e.g. $\mathbb{F} = \mathbb{F}_2$ or $\mathbb{F} = \mathbb{F}_3$) and we shall be working inside the $n$-dimensional vector spaces $\mathbb{F}^n$ over $\mathbb{F}$ for various natural numbers $n$. More generally, any linear algebra term (e.g. span, independence, basis, subspace, linear transformation, etc.) will be understood to be over the field $\mathbb{F}$.

If $f : \mathbb{F}^n \to \mathbb{C}$ is a function, and $h \in \mathbb{F}^n$ is a shift, we define the (multiplicative) derivative $\Delta_h f : \mathbb{F}^n \to \mathbb{C}$ of $f$ by the formula

\[ \Delta_h f(x) := f(x + h)\overline{f(x)}. \]

An important special case arises when $f$ takes the form $f = e_{\mathbb{F}}(P)$, where $P : \mathbb{F}^n \to \mathbb{F}$ is a function, and $e_{\mathbb{F}} : \mathbb{F} \to \mathbb{C}$ is the standard character $e_{\mathbb{F}}(j) := e^{2\pi i j/|\mathbb{F}|}$ for $j = 0, \ldots, |\mathbb{F}| - 1$. In that case we see that $\Delta_h f = e_{\mathbb{F}}(D_h P)$, where $D_h P : \mathbb{F}^n \to \mathbb{F}$ is the (additive) derivative of $P$, defined as

\[ D_h P(x) := P(x + h) - P(x). \]

Given an integer $d \ge 0$, we say that a function $P : \mathbb{F}^n \to \mathbb{F}$ is a polynomial of degree at most $d$ if we have $D_{h_1} \cdots D_{h_{d+1}} P = 0$ for all $h_1, \ldots, h_{d+1} \in \mathbb{F}^n$, and write $\mathcal{P}_d(\mathbb{F}^n)$ for the space of all polynomials on $\mathbb{F}^n$ of degree at most $d$. Thus for instance $\mathcal{P}_0(\mathbb{F}^n)$ is the space of constants, $\mathcal{P}_1(\mathbb{F}^n)$ is the space of linear polynomials on $\mathbb{F}^n$, $\mathcal{P}_2(\mathbb{F}^n)$ is the space of quadratic polynomials, and so forth. It is easy to see that $\mathcal{P}_d(\mathbb{F}^n)$ is a vector space and that, with an obvious notation, the monomials $x_1^{i_1} \cdots x_n^{i_n}$ for $0 \le i_1, \ldots, i_n < |\mathbb{F}|$ and $i_1 + \cdots + i_n \le d$ form a basis. (The restriction $i_1, \ldots, i_n < |\mathbb{F}|$ arises of course from the fact that $x^{|\mathbb{F}|} = x$ for all $x \in \mathbb{F}$.) We shall say that a function $f : \mathbb{F}^n \to \mathbb{C}$ is a polynomial phase of degree at most $d$ if it takes the form $f = e_{\mathbb{F}}(P)$ for some $P \in \mathcal{P}_d(\mathbb{F}^n)$, or equivalently if all $(d+1)^{\text{th}}$ multiplicative derivatives $\Delta_{h_1} \cdots \Delta_{h_{d+1}} f$ are identically 1.
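The derivative characterisation of degree can be checked by brute force on very small examples. The following is a minimal sketch, assuming polynomials are represented as Python callables on tuples of residues mod a prime p; the names `derivative` and `has_degree_at_most` are illustrative and do not come from the paper.

```python
from itertools import product

def derivative(P, h, p):
    """Return the additive derivative D_h P : x -> P(x + h) - P(x) over F_p."""
    def DhP(x):
        shifted = tuple((xi + hi) % p for xi, hi in zip(x, h))
        return (P(shifted) - P(x)) % p
    return DhP

def has_degree_at_most(P, d, n, p):
    """Check by brute force that every (d+1)-fold derivative of P vanishes."""
    points = list(product(range(p), repeat=n))
    for hs in product(points, repeat=d + 1):
        Q = P
        for h in hs:
            Q = derivative(Q, h, p)
        if any(Q(x) != 0 for x in points):
            return False
    return True

p, n = 3, 2
quadratic = lambda x: (x[0] * x[1] + 2 * x[0]) % p
cube = lambda x: pow(x[0], 3, p)   # x^3 = x on F_3, so this is really linear

print(has_degree_at_most(quadratic, 2, n, p))  # True
print(has_degree_at_most(quadratic, 1, n, p))  # False
print(has_degree_at_most(cube, 1, n, p))       # True
```

The last line illustrates the remark about the restriction on exponents: over $\mathbb{F}_3$ the function $x \mapsto x^3$ coincides with $x$, so it has degree at most 1 in the sense defined above.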



It is of interest to test for the property that a function $P : \mathbb{F}^n \to \mathbb{F}$ is "close to" a polynomial of degree at most $d$, or to test for the closely related property that a function $f : \mathbb{F}^n \to \mathbb{C}$ "correlates" with a polynomial phase of degree at most $d$. One proposal to perform such a test goes by the name of the Inverse Conjecture for the Gowers norms (see e.g. [11], [12], [?]), which roughly speaking asserts that a function $f$ correlates with a polynomial phase of degree at most $d$ if and only if the $(d+1)^{\text{th}}$ multiplicative derivatives of $f$ are biased. To describe this conjecture more precisely, we need some further notation.

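Definition 1.1 (Gowers uniformity norm). [8], [9] Let $f : \mathbb{F}^n \to \mathbb{C}$ be a function, and let $d \ge 0$ be an integer. We then define the Gowers norm $\|f\|_{U^{d+1}}$ of $f$ to be the quantity$^1$

\[ \|f\|_{U^{d+1}} := \left| \mathbb{E}_{h_1, \ldots, h_{d+1}, x \in \mathbb{F}^n} \Delta_{h_1} \cdots \Delta_{h_{d+1}} f(x) \right|^{1/2^{d+1}}, \]

thus $\|f\|_{U^{d+1}}$ measures the average bias in $(d+1)^{\text{th}}$ multiplicative derivatives of $f$. We also define the weak Gowers norm $\|f\|_{u^{d+1}}$ of $f$ to be the quantity

\[ \|f\|_{u^{d+1}} := \sup_{Q \in \mathcal{P}_d(\mathbb{F}^n)} \left| \mathbb{E}_{x \in \mathbb{F}^n} f(x) e_{\mathbb{F}}(-Q(x)) \right|, \]

thus $\|f\|_{u^{d+1}}$ measures the extent to which $f$ can correlate with a polynomial phase of degree at most $d$.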
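For orientation, both norms can be evaluated by exhaustive enumeration when $n$ and $d$ are tiny. The sketch below is only an illustration under that assumption; it handles functions of the form $f = e_{\mathbb{F}}(P)$, uses the identity $\Delta_{h_1} \cdots \Delta_{h_{d+1}} f = e_{\mathbb{F}}(D_{h_1} \cdots D_{h_{d+1}} P)$, and computes the weak norm only for $d = 1$. All function names are hypothetical.

```python
import cmath
from itertools import product

def e_F(j, p):
    """Standard character e_F(j) = exp(2*pi*i*j/p) on F_p."""
    return cmath.exp(2j * cmath.pi * (j % p) / p)

def iterated_derivative(P, x, hs, p):
    """D_{h_1}...D_{h_k} P(x) as an alternating sum over the cube {0,1}^k."""
    k = len(hs)
    total = 0
    for omega in product((0, 1), repeat=k):
        y = tuple((x[c] + sum(w * h[c] for w, h in zip(omega, hs))) % p
                  for c in range(len(x)))
        total += (-1) ** (k - sum(omega)) * P(y)
    return total % p

def gowers_norm_of_phase(P, d, n, p):
    """Brute-force ||e_F(P)||_{U^{d+1}} on F_p^n (only feasible for tiny n, d)."""
    points = list(product(range(p), repeat=n))
    total, count = 0, 0
    for x in points:
        for hs in product(points, repeat=d + 1):
            total += e_F(iterated_derivative(P, x, hs, p), p)
            count += 1
    return abs(total / count) ** (1 / 2 ** (d + 1))

def weak_u2_norm_of_phase(P, n, p):
    """Brute-force ||e_F(P)||_{u^2}: best correlation with a linear phase."""
    points = list(product(range(p), repeat=n))
    best = 0.0
    for a in points:  # the constant term of Q does not change the magnitude
        s = sum(e_F(P(x) - sum(ai * xi for ai, xi in zip(a, x)), p) for x in points)
        best = max(best, abs(s) / len(points))
    return best

P = lambda x: (x[0] * x[1]) % 2
print(gowers_norm_of_phase(P, 1, 2, 2))   # about 0.7071 = (1/4)^(1/4)
print(gowers_norm_of_phase(P, 2, 2, 2))   # 1.0, since deg P <= 2
print(weak_u2_norm_of_phase(P, 2, 2))     # 0.5
```

In this example the weak norm $0.5$ is smaller than the Gowers norm $\approx 0.7071$, and the $U^3$-norm of a quadratic phase equals 1, as the definitions predict.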

Remark. It can be shown that the Gowers and weak Gowers norms are in fact norms for $d \ge 2$ (and seminorms for $d = 1$); see e.g. [?], [18]. Further discussion of these two norms can be found in [12].



The Gowers norm and weak Gowers norm are closely related; for instance, one easily verifies the invariance

\[ \|fg\|_{U^{d+1}} = \|f\|_{U^{d+1}} \quad \text{and} \quad \|fg\|_{u^{d+1}} = \|f\|_{u^{d+1}} \tag{1.1} \]

for all polynomial phases $g$ of degree at most $d$, and from this and the Cauchy-Schwarz-Gowers inequality (see e.g. [?]) one can also verify the bound

\[ \|f\|_{u^{d+1}} \le \|f\|_{U^{d+1}} \tag{1.2} \]

whenever $f$ is bounded in magnitude by 1. In the converse direction the following had been suggested, and was stated formally in [?], [?].

$^1$ Here, as in all our papers, the expectation notation $\mathbb{E}_{x \in S}$ refers to the average $\frac{1}{|S|} \sum_{x \in S}$ over some finite non-empty set $S$. In this particular example, $S = (\mathbb{F}^n)^{d+2}$.

Conjecture 1.2 (Inverse conjecture for the Gowers norm). Let $d \ge 0$, let $\delta \in (0, 1]$, and let $\mathbb{F}$ be a fixed finite field. Suppose that $f : \mathbb{F}^n \to \mathbb{C}$ is a function with $|f(x)| \le 1$ for all $x \in \mathbb{F}^n$ and for which $\|f\|_{U^{d+1}} \ge \delta$. Then $\|f\|_{u^{d+1}} \gg_{d,\delta,\mathbb{F}} 1$; that is to say, there is some $c = c(d, \delta, \mathbb{F}) > 0$ such that $\|f\|_{u^{d+1}} \ge c$.

This conjecture has been verified in a number of special cases. For instance the case $d = 0$ is trivial, and the case $d = 1$ is easily established by Plancherel's theorem. The case $d = 2$ was established in odd characteristic in [12] and in the case $|\mathbb{F}| = 2$ (which is of particular interest in theoretical computer science) in [16]. The case when $\delta$ is sufficiently close to 1 (depending on $d$ and $\mathbb{F}$) was established in [3] (see also the earlier related work of [5] in the case $d = 1$, and [?] in the case when $|\mathbb{F}|$ is assumed large compared to $d$ and $\delta$).

One of our results in this paper establishes a further special case of the conjecture, when the function $f$ is itself a polynomial phase, and the characteristic of $\mathbb{F}$ is not too small.

Theorem 1.3 (Inverse conjecture for polynomial phases). Suppose that $0 \le d, k < |\mathbb{F}|$, and that $\delta \in (0, 1]$. Let $P : \mathbb{F}^n \to \mathbb{F}$ be a polynomial of degree $k$, write $f(x) := e_{\mathbb{F}}(P(x))$, and suppose that $\|f\|_{U^{d+1}} \ge \delta$. Then we have $\|f\|_{u^{d+1}} \gg_{\mathbb{F},\delta} 1$.

Note carefully the lower bound on the characteristic $|\mathbb{F}|$ of $\mathbb{F}$. It turns out that some such restriction is necessary, and indeed that Conjecture 1.2 is false without some modification. This is elucidated by the following example, which we shall analyse in §10. For any $d \ge 0$ and any vector space $\mathbb{F}^n$, let $S_d \in \mathcal{P}_d(\mathbb{F}^n)$ be the symmetric polynomial of degree $d$:

\[ S_d(x_1, \ldots, x_n) := \sum_{1 \le i_1 < \cdots < i_d \le n} x_{i_1} \cdots x_{i_d}. \tag{1.3} \]

Theorem 1.4 (Counterexample for the $U^4$-norm in $\mathbb{F}_2$). Let $n$ be a large integer. Then the function $f : \mathbb{F}_2^n \to \{-1, 1\}$ defined by $f := e_{\mathbb{F}_2}(S_4) = (-1)^{S_4}$ is such that

\[ \|f\|_{U^4}^{16} = \frac{1}{8} + O(2^{-n/2}) \tag{1.4} \]

but such that

\[ \|f\|_{u^4} \ll (\log n)^{-c} \tag{1.5} \]

for some absolute constant $c > 0$.

This counterexample was discovered independently by Lovett, Meshulam and Samorodnitsky [15]. They obtain a very much stronger bound for the lack of correlation of $f$ with a cubic phase, namely $\|f\|_{u^4} \ll 2^{-cn}$. We obtain our bound by a very slight modification of Ramsey-theoretic arguments of Alon and Beigel [2]. We will in fact be able to establish similar results with $S_4$ replaced by $S_{2^j}$ for $j \ge 2$; see Theorem 11.3. The aforementioned paper of Lovett, Meshulam and Samorodnitsky goes further in establishing counterexamples to Conjecture 1.2 for all prime fields $\mathbb{F} = \mathbb{F}_p$; specifically, the conjecture fails when $d + 1 = p^2$.

We note that the counterexample presented in Theorem 1.4 is also a counterexample to the specific case of Conjecture 1.2 given as [11, Conjecture 21].

It seems of interest to determine for what other degrees, Gowers norms, and characteristics one has a counterexample of the above type, and to ask what can be salvaged when $\mathbb{F}$ is very small. We will speculate on these questions in §11. We do not regard Theorem 1.4 as an obstacle to the possible truth of the inverse conjecture over $\mathbb{Z}/N\mathbb{Z}$ on which our programme to count solutions to linear equations in primes depends (cf. [13]). Indeed this seems to be a "low characteristic" issue, albeit one of a rather interesting nature.



We turn now to a discussion of the main technical result of the paper, on which the proof of Theorem 1.3 depends. We begin by defining the notion of rank.

Definition 1.5 (Rank). Let $d \ge 0$, and let $P : \mathbb{F}^n \to \mathbb{F}$ be a function. We define the degree $d$ rank $\operatorname{rank}_d(P)$ of $P$ to be the least integer $k \ge 0$ for which there exist polynomials $Q_1, \ldots, Q_k \in \mathcal{P}_d(\mathbb{F}^n)$ and a function $B : \mathbb{F}^k \to \mathbb{F}$ such that we have the representation $P = B(Q_1, \ldots, Q_k)$. If no such $k$ exists, we declare $\operatorname{rank}_d(P)$ to be infinite (since $\mathbb{F}^n$ is finite-dimensional, this only occurs when $d = 0$ and $P$ is non-constant).



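In the low-degree case, it is well known that the bias $\mathbb{E}_{x \in \mathbb{F}^n} e_{\mathbb{F}}(P(x))$ of a polynomial phase $e_{\mathbb{F}}(P(x))$ is closely related to the rank of $P$. For instance, if $P \in \mathcal{P}_1(\mathbb{F}^n)$ is linear, then from simple Fourier analysis we see that $\mathbb{E}_{x \in \mathbb{F}^n} e_{\mathbb{F}}(P(x))$ has magnitude 1 if $\operatorname{rank}_0(P) = 0$ and magnitude 0 otherwise. For quadratic polynomials, we have the following well-known fact:

Lemma 1.6 (Gauss sum estimate). If $P \in \mathcal{P}_2(\mathbb{F}^n)$, then

\[ |\mathbb{E}_{x \in \mathbb{F}^n} e_{\mathbb{F}}(P(x))| \ll |\mathbb{F}|^{-c \operatorname{rank}_1(P)} \]

where $c > 0$ is an absolute constant.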



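Proof. If $P \in \mathcal{P}_1(\mathbb{F}^n)$ then the claim can be verified by Fourier analysis, so we can assume that $P \notin \mathcal{P}_1(\mathbb{F}^n)$. We begin with the easy case $|\mathbb{F}| > 2$, and then discuss the changes needed to handle $|\mathbb{F}| = 2$.

Suppose that

\[ |\mathbb{E}_{x \in \mathbb{F}^n} e_{\mathbb{F}}(P(x))| \ge \delta \tag{1.6} \]

for some $0 < \delta < 1/2$. It will suffice to show that $\operatorname{rank}_1(P) \ll \log_{|\mathbb{F}|} (1/\delta)$.

Squaring (1.6), we conclude that

\[ \delta^2 \le \mathbb{E}_{x, y \in \mathbb{F}^n} e_{\mathbb{F}}(P(x) - P(y)) = \mathbb{E}_{x, h \in \mathbb{F}^n} e_{\mathbb{F}}(D_h P(x)). \]

From Fourier analysis, we see that the average $\mathbb{E}_{x \in \mathbb{F}^n} e_{\mathbb{F}}(D_h P(x))$ vanishes unless $D_h P \in \mathcal{P}_0(\mathbb{F}^n)$, in which case it has magnitude 1. Thus the assumption (1.6) implies that

\[ \mathbb{P}_{h \in \mathbb{F}^n}(D_h P \in \mathcal{P}_0(\mathbb{F}^n)) \ge \delta^2. \]

Now by breaking up $P$ into monomials, we can express $P(x) = B(x, x) + L(x)$ for some bilinear form $B : \mathbb{F}^n \times \mathbb{F}^n \to \mathbb{F}$ and some $L \in \mathcal{P}_1(\mathbb{F}^n)$. In the odd characteristic case $|\mathbb{F}| > 2$, we can take $B$ to be symmetric. We conclude that

\[ D_h P(x) = 2B(x, h) \pmod{\mathcal{P}_0(\mathbb{F}^n)}, \]

and hence that

\[ \mathbb{P}_{h \in \mathbb{F}^n}(B(x, h) = 0 \text{ for all } x \in \mathbb{F}^n) \ge \delta^2. \]

If $\delta^2 > 1/|\mathbb{F}|$ then this forces $B$ to vanish identically, which contradicts the hypothesis $P \notin \mathcal{P}_1(\mathbb{F}^n)$, so we may assume $\delta^2 \le 1/|\mathbb{F}|$. Then the linear transformation associated to $B$ has rank at most $O(\log_{|\mathbb{F}|} 1/\delta)$; since $P(x) = B(x, x) + L(x)$, we conclude $\operatorname{rank}_1(P) \ll \log_{|\mathbb{F}|} 1/\delta$ as desired.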

Now we consider the even characteristic case $|\mathbb{F}| = 2$, in which case we cannot take $B$ to be symmetric. Then the above argument gives

\[ \mathbb{P}_{h \in \mathbb{F}^n}(\tilde{B}(x, h) = 0 \text{ for all } x \in \mathbb{F}^n) \ge \delta^2, \]

where $\tilde{B}(x, h) := B(x, h) + B(h, x)$ is a symmetric bilinear form. Thus $\tilde{B}$ must have rank $O(\log_2 1/\delta)$. By linear algebra we can thus express

\[ \tilde{B}(x, h) = \sum_{1 \le i, j \le k} c_{i,j} L_i(x) L_j(h) \]

for some $k \ll \log_2 1/\delta$, some linearly independent linear functionals $L_i : \mathbb{F}^n \to \mathbb{F}$, and some coefficients $c_{i,j} \in \mathbb{F}$. Since $\tilde{B}$ is symmetric and the $L_i$ are independent, we have $c_{i,j} = c_{j,i}$. Since $\tilde{B}(x, x) = B(x, x) + B(x, x)$ vanishes in characteristic 2, we also see that $c_{i,i} = 0$. We can thus write

\[ \tilde{B}(x, h) = C(x, h) + C(h, x) \]

where $C(x, h) := \sum_{1 \le i < j \le k} c_{i,j} L_i(x) L_j(h)$ is the lower-triangular component of $\tilde{B}(x, h)$.

We then easily verify that $B(x, x) - C(x, x)$ is a linear function of $x$, and so $P(x)$ can be expressed as the sum of $C(x, x)$ and a linear function, from which the claim $\operatorname{rank}_1(P) \ll \log_2 1/\delta$ follows. □
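As a numerical illustration of the relationship between bias and rank, one can compute the bias of the hyperbolic quadratic forms $x_1 x_2 + x_3 x_4 + \cdots + x_{2k-1} x_{2k}$ over $\mathbb{F}_2$. The sketch below is an illustration only, with hypothetical helper names; it uses the fact that such a form depends on just $2k$ linear forms, so that its degree 1 rank is at most $2k$, and it exhibits bias exactly $2^{-k}$.

```python
from itertools import product

def bias_F2(P, n):
    """|E_x (-1)^{P(x)}| for P : F_2^n -> F_2, by exhaustive enumeration."""
    total = sum((-1) ** (P(x) % 2) for x in product((0, 1), repeat=n))
    return abs(total) / 2 ** n

def hyperbolic(k):
    """The quadratic form x_1 x_2 + ... + x_{2k-1} x_{2k} (0-indexed in code)."""
    return lambda x: sum(x[2 * i] * x[2 * i + 1] for i in range(k)) % 2

n = 8
for k in range(5):
    print(k, bias_F2(hyperbolic(k), n))   # the bias equals 2^{-k}
```

Each additional hyperbolic pair halves the bias, which is the kind of exponential decay in the rank that the lemma describes.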

We shall establish the following generalisation of the above estimate to higher degree polynomials, provided that the degree does not exceed the characteristic:

Theorem 1.7 (Lack of equidistribution implies bounded rank). Suppose that an integer $d$ satisfies $0 \le d < |\mathbb{F}|$. Let $\delta \in (0, 1]$, and suppose that $P \in \mathcal{P}_d(\mathbb{F}^n)$ is such that $|\mathbb{E}_{x \in \mathbb{F}^n} e_{\mathbb{F}}(P(x))| \ge \delta$. Then $\operatorname{rank}_{d-1}(P) \ll_{\mathbb{F},\delta,d} 1$.

The proof of this theorem is the technical heart of the paper, and will be accomplished in §5. It is possible that the restriction on $|\mathbb{F}|$ can be removed, but our method of proof breaks down when $d \ge |\mathbb{F}|$. Certainly the deduction of Theorem 1.3 from Theorem 1.7 breaks down in this case (which of course it must, thanks to Theorem 1.4).

Acknowledgements. The authors are indebted to Andrej Bogdanov, Tali Kaufman, and Emanuele Viola for suggesting this problem, and for many useful discussions. The authors also thank Alex Samorodnitsky for drawing attention to the recent preprint [15], and Peter Sarnak for suggestions.

2. Factors and regularity 

In this section we give some definitions and results which will be useful in our proof of Theorem 1.7.

Definition 2.1 (Factors and configuration space). Suppose that $d \ge 1$ is an integer and that $M_1, \ldots, M_d$ are further non-negative integers. By a factor of degree $d$ on $\mathbb{F}^n$ we mean a collection $\mathcal{F} = (P_{i,j})_{1 \le i \le d, 1 \le j \le M_i}$ where $P_{i,j} \in \mathcal{P}_i(\mathbb{F}^n)$ for all $i, j$. By the dimension $\dim(\mathcal{F})$ of $\mathcal{F}$ we mean the quantity $M_1 + \cdots + M_d$. Write $\mathcal{F}_i$ for the $i$-degree part of $\mathcal{F}$, that is to say the collection $(P_{i,j})_{1 \le j \le M_i}$. Although we are using the term factor to describe nothing more complicated than a collection of polynomials, we encourage the reader to think in addition of the $\sigma$-algebra $\sigma(\mathcal{F})$ defined by these polynomials $P_{i,j}$, that is to say the partition of $\mathbb{F}^n$ into atoms of the form $\{x : P_{i,j}(x) = c_{i,j}\}$. We write $\Sigma = \mathbb{F}^{M_1} \times \cdots \times \mathbb{F}^{M_d}$ and call this the configuration space of $\mathcal{F}$. We write $\Phi : \mathbb{F}^n \to \Sigma$ for the evaluation map given by $\Phi(x) = (P_{i,j}(x))_{1 \le i \le d, 1 \le j \le M_i}$.

We will use the notation of this definition throughout the paper without further comment. Sometimes we will have factors $\mathcal{F}$, $\mathcal{F}'$ and $\mathcal{F}''$; we will write $P_{i,j}, P'_{i,j}, P''_{i,j}$, $\Sigma, \Sigma', \Sigma''$, $M_i, M'_i, M''_i$, $\Phi, \Phi', \Phi''$ and so on for the corresponding polynomials, configuration spaces, dimensions and evaluation maps.

We will frequently need to extend a factor into a more regular one, by expressing the complicated polynomials in a factor by simpler ones. Our notation for this concept is as follows. We say that a factor $\mathcal{F}'$ is an extension of $\mathcal{F}$ if $\sigma(\mathcal{F}')$ is a (possibly trivial) refinement of $\sigma(\mathcal{F})$. Note that this is not the same thing as saying that the collection $(P'_{i,j})$ defining $\mathcal{F}'$ contains the collection $(P_{i,j})$ defining $\mathcal{F}$. For example, the factor defined by the linear polynomials $x_1, x_2$ is a refinement of that defined by the polynomials $x_1$, $x_2$ and $x_1 + x_2$.

By a growth function of order $d$ we mean a non-decreasing function $F : \mathbb{Z}^+ \to \mathbb{R}^+$.

Definition 2.2 ($F$-regularity). Let $\mathcal{F}$ be a factor of order $d$, and let $F$ be a growth function. We say that $\mathcal{F}$ is $F$-regular if we have

\[ \operatorname{rank}_{i-1}\Big(\sum_{j=1}^{M_i} c_{i,j} P_{i,j}\Big) \ge F(\dim(\mathcal{F})) \]

for all $1 \le i \le d$ and all coefficients $c_{i,1}, \ldots, c_{i,M_i} \in \mathbb{F}$ that are not all zero. (In particular, if $F$ is positive, this implies that the polynomials $P_{i,1}, \ldots, P_{i,M_i}$ are linearly independent.)

Example. If $d$, $F$ and $M_1, \ldots, M_d$ are fixed, and $P_{i,j}$ are chosen uniformly at random from $\mathcal{P}_i(\mathbb{F}^n)$, then the resulting factor $\mathcal{F}$ will be $F$-regular with probability $1 - o(1)$, where $o(1)$ goes to zero as $n \to \infty$ for fixed $d, F, M_1, \ldots, M_d$. Indeed, one should view the polynomials in an $F$-regular factor as "behaving like" generic polynomials, in that they obey no unexpected algebraic constraints of bounded complexity.

The following lemma, which allows us to take an arbitrary factor $\mathcal{F}$ and find a highly regular extension of it, is absolutely fundamental to our arguments. This generalises [11, Lemma 8.7] to the case of factors of degree 3 or more. The result is faintly analogous in some ways to Szemerédi's regularity lemma for graphs and to more recent versions of this for hypergraphs.

Lemma 2.3 (Regularity lemma). Let $d \ge 1$, let $F$ be a growth function, and let $\mathcal{F}$ be a factor of degree $d$. Then there exists an $F$-regular extension $\mathcal{F}'$ of $\mathcal{F}$ of degree $d$ satisfying the dimension bound

\[ \dim(\mathcal{F}') \ll_{F, d, \dim(\mathcal{F})} 1. \]

Remark. The actual bound we obtain here, if one worked it out, would have an extremely weak dependence on $F$, $d$ and $\dim(\mathcal{F})$. Even for quite "reasonable" growth functions $F$ one starts to see functions in the Ackermann hierarchy making an appearance. It is our dependence on this lemma and the rather poor bounds that result from its proof that renders Theorem 1.7 essentially ineffective.

Proof. Fix $d$ and $F$. We shall induct on the dimension vector $(M_1, \ldots, M_d)$ of $\mathcal{F}$ where, of course, $M_i := \dim(\mathcal{F}_i)$. This dimension vector takes values in $\mathbb{Z}_+^d$, which we shall order in reverse lexicographical ordering, that is to say $(M_1, \ldots, M_d) < (M'_1, \ldots, M'_d)$ if there exists $1 \le i \le d$ such that $M_i < M'_i$ and $M_j = M'_j$ for all $i < j \le d$. This turns $\mathbb{Z}_+^d$ into a well-ordered set (with the ordinal type $\omega^d$), and so we can perform strong induction on this space. In other words, we may assume without loss of generality that the claim has already been proven for all smaller dimension vectors.

If $\mathcal{F}$ is already $F$-regular, then we are done. Otherwise, there exists $i \in [d]$ and a non-trivial linear combination $Q_i$ of the $P_{i,1}, \ldots, P_{i,M_i}$ such that $\operatorname{rank}_{i-1}(Q_i) < F(\dim(\mathcal{F}))$, or in other words $Q_i$ is some combination of fewer than $F(\dim(\mathcal{F}))$ polynomials of degree at most $i - 1$. By rewriting $Q_i$ in this fashion, we can find an extension $\mathcal{F}''$ of $\mathcal{F}$ with dimension vector

\[ (M_1, \ldots, M_{i-1} + \lfloor F(\dim(\mathcal{F})) \rfloor, M_i - 1, M_{i+1}, \ldots, M_d) \]

(with some obvious modifications in the easy case $i = 1$). Applying the induction hypothesis to $\mathcal{F}''$ we obtain the claim. □
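The bookkeeping behind this induction can be made concrete. The following sketch is an illustration only: `revlex_less` and `refine_step` are hypothetical names, the growth function is an arbitrary choice, and the treatment of $i = 1$ (simply dropping one polynomial) is our reading of the "obvious modifications" mentioned in the proof.

```python
import math

def revlex_less(M, M_prime):
    """Reverse lexicographic order: compare the highest-degree entries first."""
    return tuple(reversed(M)) < tuple(reversed(M_prime))

def refine_step(M, i, F):
    """Dimension vector after trading one degree-i polynomial (1-based i)
    for floor(F(dim)) polynomials of degree i - 1."""
    M = list(M)
    dim = sum(M)
    M[i - 1] -= 1
    if i >= 2:                        # assumption: for i = 1 the polynomial is dropped
        M[i - 2] += math.floor(F(dim))
    return tuple(M)

F = lambda m: 2 ** m                  # a hypothetical growth function
M = (3, 2, 1)
M_new = refine_step(M, 3, F)
print(M_new, revlex_less(M_new, M))   # (3, 66, 0) True
```

Although the total dimension can grow enormously at each step, the dimension vector strictly decreases in the reverse lexicographic order, which is why the process terminates.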
3. A lemma of Bogdanov and Viola



In this section we recall [?, Lemma 25], and provide a proof in the interests of self-containment. This lemma almost immediately establishes our main result, Theorem 1.7, except for the presence of some small errors. Our main task in subsequent sections is to eliminate the errors and turn this near-miss result into a proof of Theorem 1.7.

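Lemma 3.1 (Bogdanov-Viola lemma). Let $d \ge 0$ be an integer, and let $\delta, \sigma \in (0, 1]$ be parameters. Suppose that $P \in \mathcal{P}_d(\mathbb{F}^n)$ is a polynomial of degree $d$ such that

\[ |\mathbb{E}_{x \in \mathbb{F}^n} e_{\mathbb{F}}(P(x))| \ge \delta. \tag{3.1} \]

Then there exists a function $\tilde{P} : \mathbb{F}^n \to \mathbb{F}$ with $\operatorname{rank}_{d-1}(\tilde{P}) \le |\mathbb{F}|^5/\delta^2\sigma$ such that $\mathbb{P}_{x \in \mathbb{F}^n}(P(x) \ne \tilde{P}(x)) \le \sigma$.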



Proof. We remark that the bound on $\operatorname{rank}_{d-1}(\tilde{P})$ is much superior to that we will eventually obtain for Theorem 1.7. This is because the Bogdanov-Viola lemma does not rely on the regularity lemma, Lemma 2.3. In fact this bound could even be improved somewhat, but this is not relevant to our work here.

For each $r \in \mathbb{F}$, define a measure $\mu_r : \mathbb{F} \to [0, 1]$ by setting

\[ \mu_r(t) := \mathbb{P}_{x \in \mathbb{F}^n}(P(x) = t + r) \]

for all $t \in \mathbb{F}$. Then (3.1) implies that $|\sum_{t \in \mathbb{F}} e_{\mathbb{F}}(t) \mu_0(t)| \ge \delta$. Noting that

\[ \sum_{t \in \mathbb{F}} e_{\mathbb{F}}(t) \mu_0(t) = e_{\mathbb{F}}(d) \sum_{t \in \mathbb{F}} e_{\mathbb{F}}(t) \mu_d(t), \]

we see that

\[ \|\mu_0 - \mu_d\| := \sum_t |\mu_0(t) - \mu_d(t)| \ge |1 - e_{\mathbb{F}}(d)| \Big| \sum_t e_{\mathbb{F}}(t) \mu_0(t) \Big| \ge 4\delta/|\mathbb{F}| \]

if $d \ne 0$, by dint of the inequality $|1 - e^{2\pi i \theta}| \ge 4|\theta|$ which holds when $|\theta| \le 1/2$. By translation invariance we conclude that

\[ \|\mu_r - \mu_s\| \ge 4\delta/|\mathbb{F}| \tag{3.2} \]

whenever $r \ne s$.

Now fix a value of $x$ and let $h \in \mathbb{F}^n$ be chosen at random. Then

\[ \mathbb{P}_h(D_h P(x) = t) = \mathbb{P}_h(P(x + h) = t + P(x)) = \mu_{P(x)}(t), \]

that is to say $D_h P(x)$ has the distribution $\mu_{P(x)}$. Now we expect that if a large number $D_{h_1} P(x), \ldots, D_{h_k} P(x)$ of points are sampled from this distribution then the observed distribution

\[ \mu_{\mathrm{obs}}(h_1, \ldots, h_k; x) := \frac{1}{k} \sum_{i=1}^k \delta_{D_{h_i} P(x)} \]

should approximate $\mu_{P(x)}$. In view of the separation property (3.2), this ought to give us a good chance of recovering $P(x)$.



POLYNOMIALS OVER FINITE FIELDS AND GOWERS NORMS






Choose $k \ge |\mathbb{F}|^5/\delta^2\sigma$ and sample $h_1, \ldots, h_k$ independently at random from $\mathbb{F}^n$. Motivated by the above discussion, we define $\tilde{P}_{h_1, \ldots, h_k}(x)$ to be that value of $r \in \mathbb{F}$ for which $\|\mu_{\mathrm{obs}}(h_1, \ldots, h_k; x) - \mu_r\|$ is minimal. Note that $\tilde{P}_{h_1, \ldots, h_k}$ is measurable with respect to the set of functions $D_{h_1} P(x), \ldots, D_{h_k} P(x)$, each of which is a polynomial of degree at most $d - 1$. Thus

\[ \operatorname{rank}_{d-1}(\tilde{P}_{h_1, \ldots, h_k}) \le k. \]

It remains to show that, at least for some choice of $h_1, \ldots, h_k$, the function $\tilde{P}_{h_1, \ldots, h_k}$ approximates $P$. Now if $\tilde{P}_{h_1, \ldots, h_k}(x) \ne P(x)$ then it follows from the separation property (3.2) that

\[ \|\mu_{\mathrm{obs}}(h_1, \ldots, h_k; x) - \mu_{P(x)}\| \ge 2\delta/|\mathbb{F}|. \]

We claim that for fixed $x$ the probability of this happening (over random choices of $h_1, \ldots, h_k$) is at most $\sigma$. Summing over $x$, it then follows that there is at least one choice of $h_1, \ldots, h_k$ for which $\#\{x : P(x) \ne \tilde{P}_{h_1, \ldots, h_k}(x)\} \le \sigma |\mathbb{F}^n|$, and the lemma follows upon taking $\tilde{P} := \tilde{P}_{h_1, \ldots, h_k}$.

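Fix $x \in \mathbb{F}^n$ and a value of $t \in \mathbb{F}$, and write $Y_i = 1_{D_{h_i} P(x) = t}$. To establish the claim, it suffices to show that

\[ \mathbb{P}\Big( \Big| \frac{1}{k} \sum_{i=1}^k Y_i - \overline{Y} \Big| \ge \frac{2\delta}{|\mathbb{F}|^2} \Big) \le \frac{\sigma}{|\mathbb{F}|}. \]

Noting that the $Y_i$ are i.i.d. Bernoulli random variables with means $\overline{Y} = \mu_{P(x)}(t)$, this follows from a suitable version of the law of large numbers. In this case we may use the inequality

\[ \mathbb{P}\Big( \Big| \frac{1}{k} \sum_{i=1}^k Y_i - \overline{Y} \Big| \ge \varepsilon \Big) \le \frac{1}{k \varepsilon^2}, \]

which follows from Chebyshev's inequality. □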

Remark. When $|\mathbb{F}| = 2$, the above proof has a pleasant interpretation. The value of $\tilde{P}_{h_1, \ldots, h_k}(x)$ is then obtained by "majority vote" amongst the values of $D_{h_i} P(x)$.
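The majority vote interpretation is easy to simulate. The sketch below is an illustration under simplifying assumptions, not the construction in full generality: it takes $|\mathbb{F}| = 2$, uses the hypothetical example $P(x) = x_1 x_2$ (which takes the value 0 with probability $3/4$, so that a plain majority vote coincides with choosing the nearest $\mu_r$), and fixes a random seed. The helper names are ours.

```python
import random
from itertools import product

def shift(x, h):
    """Coordinatewise sum x + h in F_2^n."""
    return tuple((a + b) % 2 for a, b in zip(x, h))

def majority_vote(P, hs):
    """Return x -> majority value of D_h P(x) = P(x + h) - P(x) over h in hs."""
    def P_tilde(x):
        ones = sum((P(shift(x, h)) - P(x)) % 2 for h in hs)
        return 1 if 2 * ones > len(hs) else 0
    return P_tilde

n, k = 6, 101                                   # k odd, so there are no ties
rng = random.Random(0)                          # fixed seed for reproducibility
hs = [tuple(rng.randrange(2) for _ in range(n)) for _ in range(k)]

P = lambda x: (x[0] * x[1]) % 2                 # biased towards the value 0
P_tilde = majority_vote(P, hs)

points = list(product((0, 1), repeat=n))
agreement = sum(P(x) == P_tilde(x) for x in points) / len(points)
print(agreement)                                # expected to be close to 1
```

Note that `P_tilde` is a function of the $k$ derivatives $D_{h_i} P$ alone, each of degree at most $d - 1$, which is exactly why its degree $d - 1$ rank is at most $k$.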



4. Counting lemmas 



We shall prove Theorem 1.7 by induction. Accordingly, we begin by first describing some consequences of Theorem 1.7 at a given order $d$, which are already of some independent interest. These consequences complement the regularity lemma in much the same way that "counting lemmas" in graph theory complement the Szemerédi regularity lemma.

Lemma 4.1 (Size of atoms). Let $d \ge 1$, and $\varepsilon > 0$. Suppose that Theorem 1.7 is true for orders up to $d$. Then there exists a growth function $F$ (depending on $d$ and $\varepsilon$) such that if $\mathcal{F}$ is an $F$-regular factor of order $d$ on $\mathbb{F}^n$ then we have the estimate

\[ \mathbb{P}_{x \in \mathbb{F}^n}(\Phi(x) = t) = (1 + O(\varepsilon)) \frac{1}{|\Sigma|} \tag{4.1} \]

for all configurations $t \in \Sigma$. In words, all the atoms in the $\sigma$-algebra $\sigma(\mathcal{F})$ have roughly the same size.

Remark. Recall that $\Sigma = \mathbb{F}^{M_1} \times \cdots \times \mathbb{F}^{M_d}$ is the configuration space associated to the factor $\mathcal{F}$, and that $\Phi : \mathbb{F}^n \to \Sigma$ is the evaluation map.

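Proof. We may expand the condition $\Phi(x) = t$ using Fourier analysis on $\Sigma$ to obtain

\[ \mathbb{P}_x(\Phi(x) = t) = \frac{1}{|\Sigma|} \sum_{r \in \Sigma} \mathbb{E}_{x \in \mathbb{F}^n} e_{\mathbb{F}}(r \cdot (\Phi(x) - t)). \]

It therefore suffices to show that

\[ \mathbb{E}_{x \in \mathbb{F}^n} e_{\mathbb{F}}\Big(\sum_{i=1}^d Q_i(x)\Big) = O(\varepsilon/|\mathbb{F}|^{\dim(\mathcal{F})}) \tag{4.2} \]

whenever the $Q_i \in \operatorname{Span}(\mathcal{F}_i)$ are not all zero. Let $s \in [d]$ be the largest integer for which $Q_s$ is non-zero. As $\mathcal{F}$ is $F$-regular, we have $\operatorname{rank}_{s-1}(Q_s) \ge F(\dim(\mathcal{F}))$. On the other hand, $\sum_{i=1}^d Q_i$ differs from $Q_s$ by an element of $\mathcal{P}_{s-1}(\mathbb{F}^n)$. Thus

\[ \operatorname{rank}_{s-1}\Big(\sum_{i=1}^d Q_i\Big) \ge F(\dim(\mathcal{F})) - 1. \]

If we choose $F$ to be sufficiently rapidly growing depending on $\varepsilon$ and $d$, we can thus invoke Theorem 1.7 to obtain (4.2) as required. □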

In addition to understanding the distribution of $\Phi(x)$, it turns out to be important to have an understanding of how $k$-dimensional parallelepipeds are distributed in configuration space. That is, we study the distribution of $(\Phi(x + \omega \cdot h))_{\omega \in \{0,1\}^k}$ in $\Sigma^{\{0,1\}^k}$, where $h = (h_1, \ldots, h_k)$ is a $k$-tuple of elements of $\mathbb{F}^n$. When $k = 2$, for example, we are interested in the 4-tuple $(\Phi(x), \Phi(x + h_1), \Phi(x + h_2), \Phi(x + h_1 + h_2))$. We prepare the ground for this study with some definitions.

Definition 4.2 (Faces and lower faces). Let $k \ge 1$ be an integer and suppose that $0 \le k' \le k$. A subset $F \subseteq \{0, 1\}^k$ is called a face of dimension $k'$ if it has the form

\[ F = \{\omega \in \{0, 1\}^k : \omega_i = \delta_i \text{ for } i \in I\}, \]

where $I \subseteq [k]$ has size $k - k'$ and each $\delta_i$ is either 0 or 1. If all of the $\delta_i$ are zero then we say that $F$ is a lower face. A lower face of dimension $k'$ can be identified with the power set of $[k] \setminus I$, which is a set of size $k'$.
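To make the definition concrete, the faces of a small cube can simply be listed. The following sketch is illustrative only (the function name `faces` and the representation of a face as a set of vertices are our choices); it confirms that $\{0,1\}^k$ has $\binom{k}{k'} 2^{k - k'}$ faces of dimension $k'$, of which $\binom{k}{k'}$ are lower faces.

```python
from itertools import combinations, product
from math import comb

def faces(k, k_prime, lower_only=False):
    """List the faces of {0,1}^k of dimension k_prime as sets of vertices."""
    cube = list(product((0, 1), repeat=k))
    result = []
    for I in combinations(range(k), k - k_prime):     # coordinates that are fixed
        deltas = [(0,) * len(I)] if lower_only else product((0, 1), repeat=len(I))
        for delta in deltas:
            result.append(frozenset(
                w for w in cube if all(w[i] == d for i, d in zip(I, delta))))
    return result

k = 3
for k_prime in range(k + 1):
    all_faces = faces(k, k_prime)
    lower = faces(k, k_prime, lower_only=True)
    assert len(all_faces) == comb(k, k_prime) * 2 ** (k - k_prime)
    assert len(lower) == comb(k, k_prime)
    assert all(len(F) == 2 ** k_prime for F in all_faces)
    print(k_prime, len(all_faces), len(lower))
```

For the ordinary cube $k = 3$ this lists 8 vertices, 12 edges, 6 squares and the cube itself, with 1, 3, 3 and 1 lower faces respectively.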

Suppose that we have a parallelepiped $(x + \omega \cdot h)_{\omega \in \{0,1\}^k}$ in $\mathbb{F}^n$, where $h = (h_1, \ldots, h_k)$ is a $k$-tuple of elements of $\mathbb{F}^n$. Consider the image $(\Phi(x + \omega \cdot h))_{\omega \in \{0,1\}^k} \in \Sigma^{\{0,1\}^k}$. This cannot be arbitrary: indeed we have the "obvious" constraints coming from the relations

\[ \sum_{\omega \in F} (-1)^{|\omega|} P_{i,j}(x + \omega \cdot h) = 0 \]

whenever $F \subseteq \{0, 1\}^k$ is a face of dimension at least $i + 1$, and $|\omega| := \omega_1 + \cdots + \omega_k$. To model these obvious constraints, we introduce some more notation.
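These relations can be tested numerically. The sketch below is a sanity check rather than a proof: it takes a hypothetical quadratic polynomial over $\mathbb{F}_3$ (so $i = 2$), a cube of dimension $k = 4$, and random base points and shifts, and verifies that the alternating sum vanishes on every face of dimension at least 3. The function `face_sum` is our own helper.

```python
import random
from itertools import combinations, product

def face_sum(P, x, hs, I, delta, p):
    """Sum of (-1)^{|w|} P(x + w.h) over the face {w : w_i = delta_i for i in I}."""
    total = 0
    for w in product((0, 1), repeat=len(hs)):
        if any(w[i] != d for i, d in zip(I, delta)):
            continue
        y = tuple((x[c] + sum(wj * h[c] for wj, h in zip(w, hs))) % p
                  for c in range(len(x)))
        total += (-1) ** sum(w) * P(y)
    return total % p

p, n, k = 3, 3, 4
P = lambda x: (x[0] * x[1] + 2 * x[2] * x[2] + x[1] + 1) % p   # degree 2
rng = random.Random(1)
rand_point = lambda: tuple(rng.randrange(p) for _ in range(n))

for _ in range(20):
    x, hs = rand_point(), [rand_point() for _ in range(k)]
    for size in (0, 1):                       # faces of dimension 4 and 3
        for I in combinations(range(k), size):
            for delta in product((0, 1), repeat=size):
                assert face_sum(P, x, hs, I, delta, p) == 0

# A 2-dimensional face is not forced to vanish for a quadratic:
e0, e1, zero = (1, 0, 0), (0, 1, 0), (0, 0, 0)
Q = lambda x: (x[0] * x[1]) % p
print(face_sum(Q, zero, [e0, e1, zero, zero], (2, 3), (0, 0), p))  # 1
```

The final line shows that the dimension threshold $i + 1$ cannot be lowered: on a face of dimension 2 the alternating sum of a quadratic is a second derivative, which need not vanish.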

Definition 4.3 (Face vectors and parallelepiped constraints). Suppose that $i_0 \in [d]$, that $j_0 \in [M_{i_0}]$ and that $F \subseteq \{0, 1\}^k$. Consider the vector $r(i_0, j_0, F) \in \Sigma^{\{0,1\}^k}$ for which $r_{i,j}(\omega) = (-1)^{|\omega|}$ if $i = i_0$, $j = j_0$ and $\omega \in F$, and is zero otherwise. We call such a vector a face vector. If $F$ is a lower face then we speak of a lower face vector. If $\dim(F) \ge i_0 + 1$ we say that the face vector (or lower face vector) is relevant. We say that $(t(\omega))_{\omega \in \{0,1\}^k} \in \Sigma^{\{0,1\}^k}$ satisfies the parallelepiped constraints if it is orthogonal to all the relevant lower face vectors.

Remarks. The motivation for this definition, of course, is that for any $x, h_1, \ldots, h_k$ the vector $(\Phi(x + \omega \cdot h))_{\omega \in \{0,1\}^k} \in \Sigma^{\{0,1\}^k}$ satisfies the parallelepiped constraints. At first sight the fact that we have restricted attention to lower face vectors may look curious. However it turns out (and is not hard to prove) that the set of relevant face vectors in $\Sigma^{\{0,1\}^k}$ is spanned by the relevant lower face vectors. We will not require this fact.

Write $\Sigma_\square \subseteq \Sigma^{\{0,1\}^k}$ for the subspace of vectors in $\Sigma^{\{0,1\}^k}$ satisfying the parallelepiped constraints.

Lemma 4.4 (Dimension of $\Sigma_\square$). Suppose that $k > d$. Then we have

\[ \dim(\Sigma_\square) = \sum_{i=1}^d M_i \sum_{0 \le j \le i} \binom{k}{j}. \]

Proof. Since $\dim(\Sigma^{\{0,1\}^k}) = 2^k(M_1 + \cdots + M_d) = \sum_{i=1}^d M_i \sum_{j=0}^k \binom{k}{j}$, it suffices to show that the dimension of the space spanned by the relevant lower face vectors is $\sum_{i=1}^d M_i \sum_{j > i} \binom{k}{j}$. This is precisely the number of different relevant lower face vectors, and so we must only show that the lower face vectors are linearly independent. To do this, we may clearly work with a fixed choice of $i$ and $j$, since the supports of the face vectors $r(i, j, F)$ are disjoint for different pairs $(i, j)$. Suppose there is some linear relation

\[ \sum_F a_F \, r(i, j, F) = 0. \]

Among all lower faces $F$ for which $a_F \ne 0$, suppose that $F_0$ contains the largest element $\omega_0$ in the lexicographic order on $\{0, 1\}^k$. Comparing coefficients of $\omega_0$ we see that $a_{F_0} = 0$, contrary to assumption. □

If the factor $\mathcal{F}$ is $F$-regular for some sufficiently rapid growth function $F$, it turns out that the parallelepiped constraints we have written down are the only relevant ones in a rather strong sense.

Proposition 4.5 (Counting parallelepipeds). Suppose that $|\mathbb{F}|, k > d$, and suppose that Theorem 1.7 is true for orders up to $d$. Let $\varepsilon \in (0, 1)$ be a parameter and suppose that $F$ grows sufficiently quickly (depending on $k$, $d$ and $\varepsilon$). Suppose that the factor $\mathcal{F}$ has degree at most $d$ and is $F$-regular. Suppose that $t_\square \in \Sigma_\square$, and that $x \in \mathbb{F}^n$ is a point with $\Phi(x) = t_\square(0)$. Then the number of $h \in (\mathbb{F}^n)^k$ such that $\Phi(x + \omega \cdot h) = t_\square(\omega)$ for all $\omega \in \{0, 1\}^k$ is $1 + O_k(\varepsilon)$ times $|\mathbb{F}|$ to the power $nk - \sum_{i=1}^d M_i \sum_{1 \le j \le i} \binom{k}{j}$.

Remark. Note carefully that we have been able to fix the basepoint $x$; this is important in applications of the proposition. This is why $j$ now only ranges from 1 to $i$ rather than from 0 to $i$ as in Lemma 4.4.

Proof. Write $\Phi_\square(h)$ for the vector $(\Phi(x + \omega \cdot h))_{\omega \in \{0,1\}^k}$ in $\Sigma^{\{0,1\}^k}$. We seek the number of $h$ for which $\Phi_\square(h) = t_\square$; by harmonic analysis on $\Sigma^{\{0,1\}^k}$ this may be expanded as

\[ \frac{|\mathbb{F}|^{nk}}{|\Sigma^{\{0,1\}^k}|} \sum_{r_\square \in \Sigma^{\{0,1\}^k}} \mathbb{E}_{h \in (\mathbb{F}^n)^k} e_{\mathbb{F}}(r_\square \cdot (\Phi_\square(h) - t_\square)). \tag{4.3} \]

Now when $r_\square$ lies in the space $W$ spanned by the relevant lower face vectors together with the vectors $r(i, j, 0)$ we have $r_\square \cdot (\Phi_\square(h) - t_\square) = 0$, since both $\Phi_\square(h)$ and $t_\square$ satisfy the parallelepiped constraints and $\Phi_\square(h)(0) = t_\square(0)$. Since the lower face vectors are linearly independent the contribution from these $r_\square$ to the sum (4.3) is $|\mathbb{F}|$ to the power $nk - \sum_{i=1}^d M_i \sum_{1 \le j \le i} \binom{k}{j}$. To conclude the argument it certainly suffices to show that the contribution from each $r_\square \notin W$ is small in the sense that

\[ |\mathbb{E}_{h \in (\mathbb{F}^n)^k} e_{\mathbb{F}}(r_\square \cdot \Phi_\square(h))| \le \varepsilon |\mathbb{F}|^{-2^k \dim(\Sigma)}. \tag{4.4} \]

Such an exponential sum is unaltered in magnitude if an arbitrary element of $W$ is added to $r_\square$. By repeated operations of this type, directed so as to reduce the largest element in the $\omega$-support of each $(r_\square(\omega))_{i,j}$ in the lexicographic order on $\{0, 1\}^k$, we may assume that $(r_\square(\omega))_{i,j} = 0$ unless $|\omega| \le i$. Since $r_\square$ is not in $W$, there is at least one choice of $i, j$ and at least one $\omega \ne 0$ for which $(r_\square(\omega))_{i,j} \ne 0$. Amongst all such triples $(i, j, \omega)$, choose one with the largest value of $i$, say $i = i_0$. For this value of $i = i_0$ choose $(j_0, \omega_0)$ with $s = |\omega_0|$ maximal, still subject to the condition that $(r_\square(\omega_0))_{i_0, j_0} \ne 0$. Note that $1 \le s \le i$. By relabelling the cube $\{0, 1\}^k$ we may assume that $\omega_0 = 1^s 0^{k-s}$. By construction, any triple $(i, j, \omega)$ in the support of $r_\square$ satisfies one of the following properties:

(i) $i > i_0$ and $\omega = 0$;

(ii) $i = i_0$ and $\omega = \omega_0$;

(iii) $i = i_0$ and at least one of the coordinates $\omega_l$, $1 \le l \le s$, is zero;

(iv) $i < i_0$.

Since $1 \le s \le i \le k$ the sum in (4.4) may then be written as an average (over $h_{s+1}, \ldots, h_k$) of sums of the form

\[ |\mathbb{E}_{h_1, \ldots, h_s} e_{\mathbb{F}}(P(x + h_1 + \cdots + h_s) + Q(h_1, \ldots, h_s))|, \]

where $P$ is not zero and lies in $\operatorname{Span}(\mathcal{F}_i)$, and $Q$ has degree at most $s - 1$ as a polynomial in $h_1, \ldots, h_s$. Such a sum may be written as

\[ |\mathbb{E}_{h_1, \ldots, h_s} b_1(h_2, \ldots, h_s) \cdots b_s(h_1, \ldots, h_{s-1}) e_{\mathbb{F}}(P(x + h_1 + \cdots + h_s))|, \]

where each $b_l$ is a bounded function which does not depend on $h_l$. By introducing dummy variables we may assume that $s = i$. Applying the Cauchy-Schwarz inequality $i$ times to eliminate the bounded functions $b_l$, we see that the sum in (4.4) may be bounded thus:

\[ |\mathbb{E}_{h \in (\mathbb{F}^n)^k} e_{\mathbb{F}}(r_\square \cdot \Phi_\square(h))| \le \big(\mathbb{E}_{h_1, \ldots, h_i} e_{\mathbb{F}}(D_{h_1} \cdots D_{h_i} P(\cdot))\big)^{1/2^i}. \]

Note that this derivative is, for fixed $h_1, \ldots, h_i$, simply a constant; we write it as $\partial^i P(h_1, \ldots, h_i)$. It follows that if (4.4) is false then

\[ |\mathbb{E}_{h_1, \ldots, h_i} e_{\mathbb{F}}(\partial^i P(h_1, \ldots, h_i))| \ge (\varepsilon |\mathbb{F}|^{-2^k \dim(\Sigma)})^{2^i}. \]

Applying Theorem 1.7 at degree $i \le d$ and with $V = (\mathbb{F}^n)^i$ we see that

\[ \operatorname{rank}_{i-1}(\partial^i P) \ll_{k, \varepsilon, \dim(\Sigma)} 1. \]

Note however that we have the Taylor expansion

\[ P(x) = \frac{1}{i!} \partial^i P(x, \ldots, x) + Q(x) \]

for some polynomial $Q$ of degree at most $i - 1$ (this is the only point in the whole paper where we use the assumption that $|\mathbb{F}| > d \ge i$, in order to ensure invertibility of $i!$). It follows that

\[ \operatorname{rank}_{i-1}(P) \ll_{k, \varepsilon, \dim(\Sigma)} 1. \]

This contradicts the $F$-regularity of the factor $\mathcal{F}$ if $F$ is assumed to grow sufficiently rapidly. □



5. Proof of Theorem 1.7



In this section we complete the proof of Theorem 1.7. Our starting point is the lemma of Bogdanov and Viola, stated as Lemma 3.1 in this paper. We urge the reader to recall the statement now. In view of that lemma, it suffices to establish the following proposition.

Proposition 5.1 (Polynomials which are almost low-rank are low-rank). Suppose that $d \ge 1$ is an integer, and that Theorem 1.7 holds for all orders up to $d - 1$. Let $\sigma_d > 0$ be a small quantity to be specified later. Suppose that $P \in \mathcal{P}_d(\mathbb{F}^n)$ and that $\mathcal{F}$ is an $F$-regular factor of degree $d - 1$, for some growth function $F$ which grows suitably rapidly in terms of $d$. Suppose that $\tilde{P} : \mathbb{F}^n \to \mathbb{F}$ is an $\mathcal{F}$-measurable function and that $\mathbb{P}(P(x) = \tilde{P}(x)) \ge 1 - \sigma_d$. Then $P$ is itself $\mathcal{F}$-measurable.

Proof of Theorem 1.7 assuming Proposition 5.1. This is almost immediate. By induction we may fix $d \ge 1$ and assume that Theorem 1.7 holds for all orders up to $d - 1$. Take the function $\tilde{P}$ appearing in the conclusion of Lemma 3.1. By construction, $\tilde{P}$ is measurable with respect to some factor $\mathcal{F}_0$ of degree at most $d - 1$ and dimension no more than $|\mathbb{F}|^5/\delta^2\sigma$. By Lemma 2.3 we may extend $\mathcal{F}_0$ to a factor $\mathcal{F}$ which is $F$-regular and satisfies $\dim(\mathcal{F}) \ll_{F, d, \delta, \mathbb{F}} 1$. The function $\tilde{P}$ is manifestly $\mathcal{F}$-measurable, and so the result follows upon applying Proposition 5.1. □

Proof of Proposition 5.1. We use the same notation for the factor $\mathcal{F}$ that was introduced
in Definition 2.1. In particular this factor is defined by polynomials $P_{i,j} \in \mathcal{P}_i(\mathbb{F}^n)$: these
should not be confused with the polynomial $P$ which is the subject of Proposition 5.1.

For the purposes of an initial discussion write $X$ for the set of points in $\mathbb{F}^n$ for which
$P(x) = \tilde{P}(x)$, thus $|X| \geq (1 - \sigma_d)|\mathbb{F}^n|$. The key idea is that we may use
$(d+1)$-dimensional parallelepipeds in $X$ to create new points $x'$ for which $P(x')$ does not
depend on which atom of $\mathcal{F}$ the point $x'$ lies in. There are two procedures we might
use:

1. Completing atoms. Suppose that $x, h_1, \dots, h_{d+1}$ are such that all $2^{d+1}$ points $x + \omega\cdot h$
lie in the same atom $A$ of $\sigma(\mathcal{F})$. Suppose in addition that $x + \omega\cdot h \in X$ whenever $\omega \neq 0$.
Then using the relation $\sum_{\omega}(-1)^{|\omega|}P(x + \omega\cdot h) = 0$ and the fact that $\tilde{P}$ is constant on
$A$, we see that $x$ also lies in $X$.

2. Creating new atoms on which $P$ is constant. Suppose that $A$ is an atom of $\sigma(\mathcal{F})$
such that there are atoms $A_\omega$, $\omega \in \{0,1\}^{d+1} \setminus 0^{d+1}$, with the following property. For any
$x \in A$, there are $h_1, \dots, h_{d+1} \in \mathbb{F}^n$ such that $x + \omega\cdot h \in A_\omega$ for all $\omega \in \{0,1\}^{d+1} \setminus 0$.
Then if $P$ is constant on each of the $A_\omega$, it is also constant on $A$. This follows from the
relation $\sum_{\omega}(-1)^{|\omega|}P(x + \omega\cdot h) = 0$ once again.
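Both procedures rest on the fact that the alternating sum of a polynomial of degree at most $d$ over the vertices of a $(d+1)$-dimensional parallelepiped vanishes. The following minimal sketch checks this numerically; the prime $p = 5$, the dimension and the particular quadratic `P` are illustrative assumptions, not taken from the paper.

```python
from itertools import product
import random

def alternating_cube_sum(poly, x, hs, p):
    """Sum over omega in {0,1}^k of (-1)^{|omega|} poly(x + omega.h), reduced mod p."""
    total = 0
    for omega in product((0, 1), repeat=len(hs)):
        point = list(x)
        for bit, h in zip(omega, hs):
            if bit:
                point = [(a + b) % p for a, b in zip(point, h)]
        sign = -1 if sum(omega) % 2 else 1
        total += sign * poly(point)
    return total % p

p, n, d = 5, 3, 2
# Hypothetical polynomial of degree 2 on F_5^3, chosen only for illustration.
P = lambda v: (v[0] * v[1] + 3 * v[2] * v[2] + 2 * v[0] + 1) % p

rng = random.Random(0)
x = [rng.randrange(p) for _ in range(n)]
hs = [[rng.randrange(p) for _ in range(n)] for _ in range(d + 1)]
result = alternating_cube_sum(P, x, hs, p)
print(result)
```

With $d + 1 = 3$ shifts the sum is a third-order finite difference of a quadratic, so the printed value should be zero for every choice of `x` and `hs`.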

It is in fact possible to perform Procedures 1 and 2 simultaneously, but the exposition 
is fractionally clearer if the urge to do this is suppressed. 

Let us start with an analysis of Procedure 1. It is easy to see using Lemma 4.1 that for
$1 - O(\sqrt{\sigma_d})$ of the atoms $A$ in $\sigma(\mathcal{F})$ we have $\mathbb{P}_{x \in A}(P(x) = \tilde{P}(x)) \geq 1 - O(\sqrt{\sigma_d})$. We say that
$P$ is almost constant on such atoms, and our task is to show that $P$ is actually 100%
constant on each such atom.

Suppose that $P$ is almost constant on the atom $A = \Phi^{-1}(t)$ and write $A' \subseteq A$ for the
set where $P = \tilde{P}$.

Lemma 5.2 (Avoiding bad parallelepipeds). Let the notation and assumptions be as
above. Suppose that $\sigma_d$ is chosen sufficiently small. Fix an $x \in A$. Then there is $h$ so
that all of the vertices $x + \omega\cdot h$, $\omega \neq 0^{d+1}$, lie in $A'$.

Proof. Let $N_\square(x)$ denote the number of parallelepipeds $(x + \omega\cdot h)_{\omega \in \{0,1\}^{d+1}}$, all of whose
vertices lie in $A$. The vector $(t,t,\dots,t) \in \Sigma^{\{0,1\}^{d+1}}$ trivially satisfies the parallelepiped
constraints, and so by Proposition 4.5 we have

$$N_\square(x) \sim |\mathbb{F}|^{\,n(d+1) - \sum_{i=1}^{d-1} M_i \sum_{1 \leq j \leq i} \binom{d+1}{j}} \tag{5.1}$$

if $F$ is sufficiently rapidly growing.

The number $N_\square(x)$ of parallelepipeds in $A$ is thus quite large. Unfortunately, this does
not immediately imply that the number of parallelepipeds in $A'$ is large, as the $N_\square(x)$
parallelepipeds in $A$ may all be intersecting the small set $A \setminus A'$. However, it will turn out
that such a concentration in $A \setminus A'$ can be picked up via the Cauchy-Schwarz inequality,
as it will force into existence an anomalously large number of pairs of parallelepipeds
that share an additional vertex in common besides $x$. The main difficulty in the proof
then lies in counting the number of such pairs properly.

We turn to the details. It suffices to show, for each fixed $\omega_0 \in \{0,1\}^{d+1} \setminus 0^{d+1}$, that the
number of parallelepipeds $(x + \omega\cdot h)_{\omega \in \{0,1\}^{d+1}}$, all of whose vertices lie in $A$, and with
$x + \omega_0\cdot h \in A \setminus A'$, is less than $2^{-d-1} N_\square(x)$. The number of such "bad" parallelepipeds
may be written as

$$\sum_{u \in A \setminus A'} \ \sum_{h \,:\, x + \omega\cdot h \in A \text{ for all } \omega,\ x + \omega_0\cdot h = u} 1$$

and we may use the Cauchy-Schwarz inequality to bound this above by

$$|A \setminus A'|^{1/2}\, \bigl|\{(h,h') : x + \omega\cdot h,\ x + \omega'\cdot h' \in A \text{ for all } \omega,\omega' \in \{0,1\}^{d+1},\ x + \omega_0\cdot h = x + \omega_0\cdot h'\}\bigr|^{1/2}.$$

Thus if $\sigma_d$ is chosen so small that $|A \setminus A'| \leq 2^{-2d-4}|A|$, it suffices to show that

$$\bigl|\{(h,h') : x + \omega\cdot h,\ x + \omega'\cdot h' \in A \text{ for all } \omega,\omega' \in \{0,1\}^{d+1},\ x + \omega_0\cdot h = x + \omega_0\cdot h'\}\bigr| \leq \frac{N_\square(x)^2}{|A|}\,(1 + O(\varepsilon)) \tag{5.2}$$

for some sufficiently small $\varepsilon > 0$.

By relabelling the cube $\{0,1\}^{d+1}$ if necessary, this may be recast as the problem of
counting the number of $h, h' \in (\mathbb{F}^n)^{d+1}$ satisfying the constraint

$$h_1 + \dots + h_s = h'_1 + \dots + h'_s$$

and for which the two parallelepipeds

$$\square_1 := (x + \omega\cdot h)_{\omega \in \{0,1\}^{d+1}}$$

and

$$\square_2 := (x + \omega\cdot h')_{\omega \in \{0,1\}^{d+1}}$$

lie in $A$. Substituting (5.1) and the approximate size of $|A|$ (cf. Lemma 4.1) into (5.2),
we see that our task is to establish that the number of such $h, h'$ is at most $1 + O(\varepsilon)$
times $|\mathbb{F}|$ to the power $n(2d+1) + \sum_{i=1}^{d-1} M_i\bigl(1 - 2\sum_{1 \leq j \leq i}\binom{d+1}{j}\bigr)$.

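The parallelepipeds $\square_1$ and $\square_2$ share the common vertices $x$ and $x + h_1 + \dots + h_s$. Note
that $\square_1$ and $\square_2$ may be embedded inside a $(2d+1)$-dimensional parallelepiped

$$\square := (x + \omega\cdot y)_{\omega \in \{0,1\}^{2d+1}},$$

where

$$y := (h_1, \dots, h_{s-1},\ h_s - h'_1 - \dots - h'_{s-1},\ h_{s+1}, \dots, h_{d+1},\ h'_1, \dots, h'_{s-1},\ h'_{s+1}, \dots, h'_{d+1}).$$

Thus, writing $e_1, \dots, e_{2d+1}$ for the standard basis of $\mathbb{Z}^{2d+1}$, $\square_1$ corresponds to the indices

$$\omega \in \{0,1\}^{d+1}\cdot(e_1, \dots, e_{s-1},\ e_s + e_{d+2} + \dots + e_{d+s},\ e_{s+1}, \dots, e_{d+1}) \tag{5.3}$$

and $\square_2$ to the indices

$$\omega \in \{0,1\}^{d+1}\cdot(e_{d+2}, \dots, e_{d+s},\ e_1 + \dots + e_s,\ e_{d+s+1}, \dots, e_{2d+1}) \tag{5.4}$$

where we use the usual dot product $(\omega_1, \dots, \omega_{d+1})\cdot(v_1, \dots, v_{d+1}) := \omega_1 v_1 + \dots + \omega_{d+1}v_{d+1}$.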

Suppose that $i \in [d-1]$ and $j \in [M_i]$. Then $P_{i,j}(x + \omega\cdot y)$ is a polynomial of total degree
at most $i$ in $\omega_1, \dots, \omega_{2d+1}$. Using the fact that $\omega = \omega^2 = \omega^3 = \dots$ for $\omega \in \{0,1\}$, we see
that there exists a polynomial $Q_{i,j} : \mathbb{Z}^{2d+1} \to \mathbb{F}$ with total degree at most $i$ and degree
at most 1 in each of $\omega_1, \dots, \omega_{2d+1}$ with the property that

$$P_{i,j}(x + \omega\cdot y) = Q_{i,j}(\omega)$$

for $\omega \in \{0,1\}^{2d+1}$. In fact this extension is unique, as the following lemma shows.

Lemma 5.3 (Extension lemma). Suppose that $Q : \mathbb{Z}^k \to \mathbb{F}$ is a polynomial in variables
$x_1, \dots, x_k$ with degree at most one in each $x_j$. Suppose that $Q(x_1, \dots, x_k)$
is equal to zero for $(x_1, \dots, x_k) \in \{0,1\}^k$. Then $Q = 0$ identically.

Proof. This appears, for example, as [1, Lemma 2.1]. We proceed by induction on $k$,
the result being trivial when $k = 1$. We may write

$$Q(x_1, \dots, x_k) = R(x_1, \dots, x_{k-1}) + x_k S(x_1, \dots, x_{k-1})$$

where both $R$ and $S$ have degree at most one in each $x_j$. Noting that $R(x_1, \dots, x_{k-1}) =
Q(x_1, \dots, x_{k-1}, 0)$ and that $S(x_1, \dots, x_{k-1}) = Q(x_1, \dots, x_{k-1}, 1) - Q(x_1, \dots, x_{k-1}, 0)$,
we see that $R(x_1, \dots, x_{k-1}) = S(x_1, \dots, x_{k-1}) = 0$ for all $x_j \in \{0,1\}$. By the inductive
hypothesis this implies that $R = S = 0$ identically. □
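The uniqueness in Lemma 5.3 can be made concrete: a polynomial of degree at most one in each variable is recovered from its values on $\{0,1\}^k$ by Möbius inversion over subsets. The sketch below is only an illustration of that fact; the function name, the prime $p = 3$ and the sample polynomial are assumptions made for the example.

```python
from itertools import product

def multilinear_coefficients(values, k, p):
    """Recover the coefficients c_T of Q = sum_T c_T prod_{j in T} x_j from Q on {0,1}^k.

    `values` maps each 0/1 tuple to the value of Q there (mod p); a subset T is
    encoded by its indicator tuple. Uses c_T = sum_{U subset T} (-1)^{|T|-|U|} Q(1_U).
    """
    coeffs = {}
    for T in product((0, 1), repeat=k):
        total = 0
        for U in product((0, 1), repeat=k):
            if all(u <= t for u, t in zip(U, T)):
                total += (-1) ** (sum(T) - sum(U)) * values[U]
        coeffs[T] = total % p
    return coeffs

p, k = 3, 3
# Hypothetical multilinear polynomial over F_3: Q = 2 + x1 + x1*x3 + 2*x1*x2*x3.
Q = lambda x: (2 + x[0] + x[0] * x[2] + 2 * x[0] * x[1] * x[2]) % p
values = {x: Q(x) for x in product((0, 1), repeat=k)}
nonzero = {T: c for T, c in multilinear_coefficients(values, k, p).items() if c}
print(nonzero)
```

In particular, if every value on the cube is zero then every recovered coefficient is zero, which is the statement of the lemma.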

It follows from Lemma 5.3, (5.3), (5.4) and the fact that $P_{i,j}(\square_1)$ and $P_{i,j}(\square_2)$ are fixed
that $Q_{i,j}(\omega)$ is fixed for $\omega$ in both of the $(d+1)$-dimensional lattices

$$\Lambda := \mathbb{Z}^{d+1}\cdot(e_1, \dots, e_{s-1}, v, e_{s+1}, \dots, e_{d+1})$$

and

$$\Lambda' := \mathbb{Z}^{d+1}\cdot(e_{d+2}, \dots, e_{d+s}, v, e_{d+s+1}, \dots, e_{2d+1}),$$

where $v \in \mathbb{Z}^{2d+1}$ is the vector

$$v := e_1 + \dots + e_s + e_{d+2} + \dots + e_{d+s}.$$

A second application of Lemma 5.3, noting that $2d > i$, confirms that $Q_{i,j}$ is determined
on

$$\mathbb{Z}^{2d}\cdot(e_1, \dots, e_{s-1}, e_{s+1}, \dots, e_{2d+1}) + \{0,1\}\cdot v$$

by its values on

$$S := \{0,1\}^{2d+1}\cdot(e_1, \dots, e_{s-1}, e_{s+1}, \dots, e_{2d+1}, v).$$

In particular we see that $Q_{i,j}(\omega)$, and hence $P_{i,j}(x + \omega\cdot y)$, is determined for
$\omega \in \{0,1\}^{2d+1}$ by its values on $S$. Since $Q_{i,j}$ has degree at most $i$ we see that it is
determined on $S$ by its values at arguments which are the sum of at most $i$ elements from
$\{e_1, \dots, e_{s-1}, e_{s+1}, \dots, e_{2d+1}, v\}$.

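Of the $\sum_{0 \leq j \leq i}\binom{2d+1}{j}$ possible choices for the values of the polynomials $Q_{i,j}$ at these
arguments, $2\sum_{0 \leq j \leq i}\binom{d+1}{j} - 2$ of them are already fixed for us since $Q_{i,j}$ is fixed in both
$\Lambda$ and $\Lambda'$. It follows that the number of choices of $(P_{i,j}(x + \omega\cdot y))_{\omega \in \{0,1\}^{2d+1}}$ is at most
$|\mathbb{F}|$ to the power $1 + \sum_{1 \leq j \leq i}\binom{2d+1}{j} - 2\sum_{1 \leq j \leq i}\binom{d+1}{j}$. Summing over $i$ and $j$, it follows that the
number of choices for $\Phi(\square)$ subject to our constraints on $\Phi(\square_1)$ and $\Phi(\square_2)$ is at most
$|\mathbb{F}|$ to the power $\sum_{i=1}^{d-1} M_i\bigl(1 + \sum_{1 \leq j \leq i}\binom{2d+1}{j} - 2\sum_{1 \leq j \leq i}\binom{d+1}{j}\bigr)$.

For each such choice the number of $\square$ is, by Proposition 4.5, $1 + O(\varepsilon)$ times $|\mathbb{F}|$ to the
power $n(2d+1) - \sum_{i=1}^{d-1} M_i \sum_{1 \leq j \leq i}\binom{2d+1}{j}$, and so the total number of $\square$ is $1 + O(\varepsilon)$ times
$|\mathbb{F}|$ to the power $n(2d+1) + \sum_{i=1}^{d-1} M_i\bigl(1 - 2\sum_{1 \leq j \leq i}\binom{d+1}{j}\bigr)$, which is what we wanted to
prove. This concludes the proof of Lemma 5.2. □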
Recall that $A' \subseteq A$ is the set of points where $P(x) = \tilde{P}(x)$. Now $A$ is an atom in the
factor $\mathcal{F}$, which has degree $d-1$, and $P$ is a polynomial of degree $d$. We therefore see
that if all the points $x + \omega\cdot h$, $\omega \in \{0,1\}^{d+1} \setminus 0^{d+1}$, lie in $A'$ then so does $x$. It follows
from Lemma 5.2 that $A' = A$.

This completes the analysis of Procedure 1, and we find ourselves in the situation
that $P(x) = \tilde{P}(x)$ on $1 - O(\sqrt{\sigma_d})$ of the atoms in $\sigma(\mathcal{F})$. Call these the good atoms. To
perform Procedure 2, we need only show that for any (bad) atom $A = A_0$ there are good
atoms $A_\omega$, $\omega \in \{0,1\}^{d+1} \setminus 0^{d+1}$, such that the sequence of coordinates $(t_\omega)_\omega = (\Phi(A_\omega))_\omega \in
\Sigma^{\{0,1\}^{d+1}}$ satisfies the parallelepiped constraints. To do this it suffices to find just a single
parallelepiped $(x + \omega\cdot h)_{\omega \in \{0,1\}^{d+1}}$ for which all of $x + \omega\cdot h$, $\omega \in \{0,1\}^{d+1} \setminus 0^{d+1}$, lie in
good atoms. To see that this is possible, fix $x \in A$ and pick $h_1, \dots, h_{d+1}$ at random.
It is clear that for any fixed $\omega \neq 0^{d+1}$, the probability that $x + \omega\cdot h$ lies in a good atom
is the same as the probability that a random element of $\mathbb{F}^n$ lies in a good atom, which
is $1 - O(\sqrt{\sigma_d})$ by Lemma 4.1. If $\sigma_d \leq c2^{-2d}$ for sufficiently small $c$ it follows that there
is indeed positive probability that all of the $x + \omega\cdot h$, $\omega \in \{0,1\}^{d+1} \setminus 0^{d+1}$, lie in good
atoms.

We have now successfully performed Procedures 1 and 2. By earlier remarks, this
concludes the proof of Proposition 5.1 and hence, by the remarks at the start of the
section, that of Theorem 1.7. □

6. Inverse theorems for the Gowers norm 

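We can now give a fairly quick proof of Theorem 1.3. We begin with a preliminary
result which is already of interest.

Proposition 6.1. Suppose that $|\mathbb{F}| > d+1 \geq 2$ and that $\delta > 0$, let $P \in \mathcal{P}_{d+1}(\mathbb{F}^n)$, and
write $f(x) := e_{\mathbb{F}}(P(x))$. Suppose that $\|f\|_{U^{d+1}} \geq \delta$. Then $\operatorname{rank}_d(P) \ll_{d,\delta} 1$.

Proof. Write $\partial^{d+1}P(h_1, \dots, h_{d+1}) := D_{h_1}\cdots D_{h_{d+1}}P(x)$. Since $P$ has degree $d+1$, this
does not depend on $x$. From the definition of the $U^{d+1}$ norm, we have

$$\bigl|\mathbb{E}_{h_1,\dots,h_{d+1} \in \mathbb{F}^n}\, e_{\mathbb{F}}(\partial^{d+1}P(h_1, \dots, h_{d+1}))\bigr| = \|f\|_{U^{d+1}}^{2^{d+1}} \geq \delta^{2^{d+1}}.$$

Applying Theorem 1.7 we conclude that

$$\operatorname{rank}_d(\partial^{d+1}P) \ll_{d,\delta} 1.$$

But since $|\mathbb{F}| > d+1$ we have the Taylor expansion

$$P(x) = \frac{1}{(d+1)!}\,\partial^{d+1}P(x, \dots, x) + Q(x)$$

where $\deg Q \leq d$. Thus the rank of $P$ is itself bounded by $O_{d,\delta}(1)$, as required. □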
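For very small parameters the Gowers norm of a polynomial phase can be computed by brute force straight from the definition, which makes the role of the iterated derivative visible. This is a minimal sketch: the function name and the test polynomial $x_1x_2$ on $\mathbb{F}_3^2$ are assumptions chosen for illustration, and the loop is exponential in $n$ and $k$.

```python
import cmath
from itertools import product

def gowers_norm_power(P, p, n, k):
    """Brute-force ||f||_{U^k}^{2^k} for f(x) = exp(2*pi*i*P(x)/p) on F_p^n."""
    points = list(product(range(p), repeat=n))
    total = 0.0
    count = 0
    for x in points:
        for hs in product(points, repeat=k):
            # The alternating conjugations combine into a single phase whose
            # exponent is sum over omega of (-1)^{|omega|} P(x + omega.h).
            expo = 0
            for omega in product((0, 1), repeat=k):
                v = list(x)
                for bit, h in zip(omega, hs):
                    if bit:
                        v = [(a + b) % p for a, b in zip(v, h)]
                expo += (-1) ** sum(omega) * P(v)
            total += cmath.exp(2j * cmath.pi * (expo % p) / p).real
            count += 1
    return total / count

# Hypothetical example: the quadratic form x1*x2 on F_3^2.
P = lambda v: (v[0] * v[1]) % 3
u2 = gowers_norm_power(P, 3, 2, 2)
u3 = gowers_norm_power(P, 3, 2, 3)
print(u2, u3)
```

For this quadratic the third-order derivative vanishes, so the $U^3$ quantity equals 1, while the $U^2$ quantity is the bias of the bilinear form $h_1k_2 + h_2k_1$, namely $1/9$.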

Proof of Theorem 1.3. We fix $d$ and induct on $k$. The cases $k \leq d$ are trivial (since
$\|f\|_{u^{d+1}} = 1$ in these cases), so we first verify the case $k = d+1$. In this case, we know
from Proposition 6.1 that $\operatorname{rank}_d(P) \ll_{d,\delta} 1$, thus we can express $f(x) = e_{\mathbb{F}}(P(x))$ as
some function of $O_{d,\delta}(1)$ polynomials of degree at most $d$. By Fourier analysis, we can
therefore obtain a representation

$$f(x) = \sum_{j=1}^{J} c_j\, e_{\mathbb{F}}(Q_j(x))$$

where $J = O_{d,\delta}(1)$, $Q_j \in \mathcal{P}_d(\mathbb{F}^n)$, and $c_j$ are complex numbers of magnitude $O_{d,\delta}(1)$ for
all $j \in [J]$. It follows immediately that $f$ has inner product $\gg_{d,\delta} 1$ with at least one
of the functions $e_{\mathbb{F}}(Q_j(x))$, and therefore $\|f\|_{u^{d+1}} \gg_{d,\delta} 1$ as desired.

Now suppose that $k > d$ and the claim has already been proven for polynomials of
degree $k$. Suppose that $P \in \mathcal{P}_{k+1}(\mathbb{F}^n)$, that $f(x) := e_{\mathbb{F}}(P(x))$ and that $\|f\|_{U^{d+1}} \geq \delta$.
By the monotonicity of Gowers norms (see e.g. [IQ, Chapter 11]) we have

$$\|f\|_{U^{k+1}} \geq \|f\|_{U^{d+1}} \geq \delta$$

and thus by Proposition 6.1 we obtain

$$\operatorname{rank}_k(P) \ll_{k,\delta} 1.$$

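Let $F$ be a growth function (depending on $k, \delta, d$) to be chosen later. Applying Lemma
2.3 we can find an $F$-regular factor $\mathcal{F}$ of degree $k$ and dimension $O_{F,k,d,\delta}(1)$ such that
$P$ is measurable with respect to $\sigma(\mathcal{F})$. By Fourier expansion, we can thus express

$$f(x) = \sum_{Q_1 \in \operatorname{Span}(\mathcal{F}_1), \dots, Q_k \in \operatorname{Span}(\mathcal{F}_k)} c_{Q_1,\dots,Q_k}\, e_{\mathbb{F}}(Q_1(x) + \dots + Q_k(x))$$

where the coefficients $c_{Q_1,\dots,Q_k}$ are complex numbers of magnitude at most $B$ for some
$B = O_{k,\dim(\Sigma)}(1)$. We may use this expansion to split $f$ as $f_1 + f_2$, where

$$f_1(x) := \sum_{Q_1 \in \operatorname{Span}(\mathcal{F}_1), \dots, Q_d \in \operatorname{Span}(\mathcal{F}_d)} c_{Q_1,\dots,Q_d,0,\dots,0}\, e_{\mathbb{F}}(Q_1(x) + \dots + Q_d(x)) \tag{6.1}$$

and

$$f_2(x) := \sum_{\substack{Q_1 \in \operatorname{Span}(\mathcal{F}_1), \dots, Q_k \in \operatorname{Span}(\mathcal{F}_k) \\ Q_s \neq 0 \text{ for some } s > d}} c_{Q_1,\dots,Q_k}\, e_{\mathbb{F}}(Q_1(x) + \dots + Q_k(x)). \tag{6.2}$$

Thus $f_2$ is the part of $f$ which "genuinely has degree larger than $d$". We shall show the
$U^{k+1}$-norm of this part is small.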

Suppose that polynomials $Q_1 \in \operatorname{Span}(\mathcal{F}_1), \dots, Q_k \in \operatorname{Span}(\mathcal{F}_k)$ are such that $Q_s$ is
non-zero and $Q_{s+1}, \dots, Q_k$ all vanish for some $s > d$. Since $\mathcal{F}$ is $F$-regular, we have
$\operatorname{rank}_{s-1}(Q_s) \geq F(\dim(\mathcal{F}))$, and thus

$$\operatorname{rank}_{s-1}(Q_1 + \dots + Q_k - Q) \geq F(\dim(\mathcal{F})) - 1 \tag{6.3}$$

for any $Q \in \mathcal{P}_d(\mathbb{F}^n)$. Applying Theorem 1.7 and the induction hypothesis, we conclude
(if $F$ is large enough) that

$$\|e_{\mathbb{F}}(Q_1 + \dots + Q_k)\|_{U^{k+1}} \leq \frac{\delta}{2B|\mathbb{F}|^{\dim(\mathcal{F})}}.$$

Since the Gowers $U^{k+1}$-norm obeys the triangle inequality (see e.g. P, Lemma 3.9]),
it follows that $\|f_2\|_{U^{k+1}} \leq \delta/2$. Recalling that $\|f\|_{U^{k+1}} \geq \delta$, another application of the
triangle inequality implies that $\|f_1\|_{U^{k+1}} \geq \delta/2$. Now by Cauchy-Schwarz we have

$$\|f_1\|_{U^{k+1}}^{2^{k+1}} \leq \|f_1\|_{2}^{2}\, \|f_1\|_{\infty}^{2^{k+1}-2}.$$

From the bounds on the Fourier coefficients $c_{Q_1,\dots,Q_k}$ we have $\|f_1\|_\infty \ll_{k,\dim(\mathcal{F})} 1$, and
therefore

$$\|f_1\|_{2} \gg_{k,\delta,\dim(\mathcal{F})} 1.$$

From (6.1) and the pigeonhole principle it follows that there exist $Q_1 \in \operatorname{Span}(\mathcal{F}_1), \dots, Q_d \in
\operatorname{Span}(\mathcal{F}_d)$ such that

$$|\langle f_1, e_{\mathbb{F}}(Q_1 + \dots + Q_d)\rangle| \geq \varepsilon$$

for some $\varepsilon \gg_{d,k,\delta,\dim(\Sigma)} 1$. On the other hand, from (6.3), Theorem 1.7 and (6.2) we
have

$$|\langle f_2, e_{\mathbb{F}}(Q_1 + \dots + Q_d)\rangle| \leq \varepsilon/2$$

if $F$ grows sufficiently rapidly. Hence from one further application of the triangle
inequality we have

$$|\langle f, e_{\mathbb{F}}(Q_1 + \dots + Q_d)\rangle| \geq \varepsilon/2,$$

and thus $\|f\|_{u^{d+1}} \geq \varepsilon/2$. Therefore the induction goes through and we have proved
Theorem 1.3. □



7. A RECURRENCE RESULT 



Proposition 5.1 had a rather lengthy proof. However, the claim is much simpler in
the case when the factor $\mathcal{F}$ is trivial. More precisely, we have the following slight
generalization of Proposition 4.5]. 

Lemma 7.1 (Non-zero polynomials do not vanish almost everywhere). Suppose that
$P \in \mathcal{P}_d(\mathbb{F}^n)$ and that $\mathbb{P}_{x \in \mathbb{F}^n}(P(x) = 0) > 1 - 2^{-d}$. Then $P$ is identically zero.



Remark. This lemma is almost certainly folkloric, but we do not have a precise 
reference for it. 

Proof. We proceed by induction on $d$, the result being obvious for $d = 1$. For any fixed
$h$ we have $\mathbb{P}_{x \in \mathbb{F}^n}(P(x+h) = P(x) = 0) > 1 - 2^{-(d-1)}$. Applying the inductive hypothesis
to $P(x+h) - P(x) \in \mathcal{P}_{d-1}(\mathbb{F}^n)$, we see that $P(x+h) - P(x) = 0$ for all $x, h$. This
manifestly implies the result. □
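Lemma 7.1 can be sanity-checked exhaustively over $\mathbb{F}_2$ in tiny dimensions by enumerating every non-zero multilinear polynomial of degree at most $d$ and recording the largest proportion of zeros. The sketch below does this; the function name and the parameters $n = 4$, $d = 2$ are illustrative assumptions.

```python
from itertools import combinations, product

def max_zero_density(n, d):
    """Largest fraction of zeros among non-zero multilinear polynomials of degree <= d on F_2^n."""
    monomials = [S for r in range(d + 1) for S in combinations(range(n), r)]
    points = list(product((0, 1), repeat=n))
    # table[m][q] is the value of monomial m at point q.
    table = [[int(all(x[i] for i in S)) for x in points] for S in monomials]
    best = 0.0
    for coeffs in product((0, 1), repeat=len(monomials)):
        if not any(coeffs):
            continue  # skip the zero polynomial
        zeros = 0
        for col in range(len(points)):
            value = sum(c * row[col] for c, row in zip(coeffs, table)) % 2
            zeros += (value == 0)
        best = max(best, zeros / len(points))
    return best

density = max_zero_density(4, 2)
print(density)
```

The maximum found is $1 - 2^{-d}$ (attained for instance by $x_1x_2$ when $d = 2$), so no non-zero polynomial exceeds the threshold in the lemma, and the threshold cannot be lowered.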

A short consequence of Lemma 7.1 is the following curious recurrence result.

Lemma 7.2 (Multiple polynomial recurrence). Suppose that $d, k \geq 1$ are integers, that
$P_1, \dots, P_k \in \mathcal{P}_d(\mathbb{F}^n)$ are polynomials and that $x_0 \in \mathbb{F}^n$. Then

$$\mathbb{P}_{x \in \mathbb{F}^n}(P_i(x) = P_i(x_0) \text{ for all } i = 1, \dots, k) \geq 2^{-(|\mathbb{F}|-1)kd}.$$

Proof. Consider the polynomial

$$Q(x) := \prod_{i=1}^{k}\ \prod_{t \in \mathbb{F} \setminus \{P_i(x_0)\}} (P_i(x) - t).$$

This polynomial has degree $(|\mathbb{F}|-1)kd$, and clearly $Q(x_0) \neq 0$. Applying Lemma 7.1 in
the contrapositive, we conclude

$$\mathbb{P}_{x \in \mathbb{F}^n}(Q(x) \neq 0) \geq 2^{-(|\mathbb{F}|-1)kd}$$

and the claim follows. □
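The recurrence bound of Lemma 7.2 can be observed numerically on random examples. The following sketch draws random polynomials over $\mathbb{F}_3$ and compares the density of the recurrence set with $2^{-(|\mathbb{F}|-1)kd}$; the helper names, the seed and the parameters $n = 6$, $d = 2$, $k = 2$ are assumptions made for the illustration.

```python
from itertools import product
import random

def random_poly(n, d, p, rng):
    """Random polynomial of degree <= d on F_p^n, stored as {exponent tuple: coefficient}."""
    return {e: rng.randrange(p)
            for e in product(range(d + 1), repeat=n) if sum(e) <= d}

def evaluate(poly, x, p):
    total = 0
    for exps, c in poly.items():
        term = c
        for xi, e in zip(x, exps):
            term = term * pow(xi, e, p) % p
        total += term
    return total % p

def recurrence_density(polys, x0, n, p):
    """Fraction of x in F_p^n with P_i(x) = P_i(x0) for every i."""
    targets = [evaluate(P, x0, p) for P in polys]
    hits = sum(all(evaluate(P, x, p) == t for P, t in zip(polys, targets))
               for x in product(range(p), repeat=n))
    return hits / p ** n

p, n, d, k = 3, 6, 2, 2
rng = random.Random(1)
polys = [random_poly(n, d, p, rng) for _ in range(k)]
x0 = tuple(rng.randrange(p) for _ in range(n))
density = recurrence_density(polys, x0, n, p)
bound = 2.0 ** (-(p - 1) * k * d)
print(density, bound)
```

With these parameters the bound is $2^{-8}$, which exceeds $3^{-6}$, so the inequality is not merely the trivial observation that $x_0$ itself lies in the recurrence set.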

Remark. In the case $d < |\mathbb{F}|$, one could also obtain a qualitative version of Lemma 7.2
by combining Lemma 2.3 (applied to the factor generated by $P_1, \dots, P_k$) followed by
Lemma 4.1. Of course, the bounds obtained by this approach are far weaker.



8. Representations that respect degree 

The results of this section and the next are somewhat technical, and by necessity some
of the notation is a little fearsome. First-time readers may wish to skip to the discussion
of the counterexample of Theorem 1.4, which is presented in §10.

In previous sections we discussed the notion of low-rank polynomials $P \in
\mathcal{P}_d(\mathbb{F}^n)$, which can be expressed as $B(Q_1, \dots, Q_k)$ with $Q_i \in \mathcal{P}_{d-1}(\mathbb{F}^n)$. In this
section we show how (under a regularity assumption on the factor generated by the $Q_i$)
the function $B$ can be chosen to be a polynomial with controlled degree.

Definition 8.1. Let $\mathcal{F}$ be a factor of degree $d \geq 1$ on $\mathbb{F}^n$. An $\mathcal{F}$-monomial is any
product of the form $\prod_{j=1}^{J} Q_j$ where each $Q_j$ belongs to one of the vector spaces $\operatorname{Span}(\mathcal{F}_{d_j})$
for some $d_j \in \{1, \dots, d\}$. The $\mathcal{F}$-degree of the $\mathcal{F}$-monomial $\prod_{j=1}^{J} Q_j$ is defined to be
$\sum_{j=1}^{J} d_j$. If $D \geq 0$, we define an $\mathcal{F}$-polynomial of $\mathcal{F}$-degree at most $D$ to be any linear
combination of $\mathcal{F}$-monomials of $\mathcal{F}$-degree at most $D$.

Example. Let $\mathbb{F}$ have large characteristic. If $\mathcal{F}$ is the degree 2 factor on $\mathbb{F}^5$ consisting of
the four polynomials $X_1X_2 + X_3$, $X_1X_2 + X_4$, $X_2 + X_3$ and $X_1 + X_5$, where $X_1, \dots, X_5$ are
the coordinate functions, the polynomial $(X_1X_2 + X_3)(X_1 + X_5)^7 + (X_1 + X_2 + X_3 + X_5)^9$
has $\mathcal{F}$-degree 9, and so does $(X_3 - X_4)^4(X_2 + X_3)$, since $X_3 - X_4 \in \operatorname{Span}(\mathcal{F}_2)$.

In the above example we saw that the $\mathcal{F}$-degree of a polynomial can exceed the ordinary
degree due to dependencies among the polynomials in the factor. The following theorem
can be viewed as a converse to this phenomenon.

Theorem 8.2 (Degree and $\mathcal{F}$-degree agree for regular factors). Let $1 \leq d, D < |\mathbb{F}|$.
Then there exists a growth function $F$ (depending on $d$ and $D$) with the following
property. Suppose that $P \in \mathcal{P}_D(\mathbb{F}^n)$ is measurable with respect to $\sigma(\mathcal{F})$, where $\mathcal{F}$ is an
$F$-regular factor of degree $d$ on $\mathbb{F}^n$. Then $P$ has $\mathcal{F}$-degree at most $D$.




Proof. Let $d, D$ be as above, let $F$ be a rapid growth function to be chosen later, and let
$P, \mathcal{F}$ be as above. Since $P$ is measurable with respect to $\sigma(\mathcal{F})$, we have a representation

$$P = B(P_{1,1}, \dots, P_{1,M_1}, \dots, P_{d,1}, \dots, P_{d,M_d})$$

for some function $B : \Sigma \to \mathbb{F}$. As $\mathbb{F}$ is a finite field, we can view $B$ as a polynomial of
$\dim(\mathcal{F})$ variables, which has individual degree at most $|\mathbb{F}| - 1$ in each of the variables
(note that all higher degrees can be eliminated since $x^{|\mathbb{F}|} = x$). Thus we can write

$$P = \sum_{r \in R} c_r \prod_{i=1}^{d}\prod_{j=1}^{M_i} P_{i,j}^{r_{i,j}} \tag{8.1}$$

where $R$ is the set of all tuples $r = (r_{i,j})_{1 \leq i \leq d;\, 1 \leq j \leq M_i}$, and the $c_r$ are coefficients in $\mathbb{F}$.
For each tuple $r \in R$, we define the weight $|r|$ of $r$ by the formula

$$|r| := \sum_{i=1}^{d}\sum_{j=1}^{M_i} i\, r_{i,j}.$$

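To prove the claim, it suffices to show that $c_r = 0$ for all tuples $r$ with weight larger
than $D$. Suppose for contradiction that this is not the case. Then we can find $r$ with
$|r| > D$ such that $c_r \neq 0$; without loss of generality we may assume that $|r|$ is maximal
with respect to this property. From (8.1), we thus have

$$P(x) = c_r \prod_{i=1}^{d}\prod_{j=1}^{M_i} P_{i,j}(x)^{r_{i,j}} + \sum_{s \in R \setminus \{r\} : |s| \leq |r|} c_s \prod_{i=1}^{d}\prod_{j=1}^{M_i} P_{i,j}(x)^{s_{i,j}}$$

for all $x \in \mathbb{F}^n$. Since $P$ has degree $D < |r|$, its $|r|$-th order derivatives vanish. Thus we
have

$$0 = c_r \sum_{\omega \in \{0,1\}^{|r|}} (-1)^{|\omega|} \prod_{i=1}^{d}\prod_{j=1}^{M_i} P_{i,j}(x + \omega\cdot h)^{r_{i,j}} + \sum_{s \in R \setminus \{r\} : |s| \leq |r|} c_s \sum_{\omega \in \{0,1\}^{|r|}} (-1)^{|\omega|} \prod_{i=1}^{d}\prod_{j=1}^{M_i} P_{i,j}(x + \omega\cdot h)^{s_{i,j}}$$

for all $x \in \mathbb{F}^n$ and $h \in (\mathbb{F}^n)^{|r|}$.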

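Now if $a = (a_{i,j}(\omega)) \in \Sigma^{\{0,1\}^{|r|}}$ satisfies the parallelepiped constraints, and if $F$ grows
sufficiently rapidly, then we know from Proposition 4.5 that there are $x \in \mathbb{F}^n$ and
$h \in (\mathbb{F}^n)^{|r|}$ such that $P_{i,j}(x + \omega\cdot h) = a_{i,j}(\omega)$ for all $i,j$ with $i \in [d]$ and $j \leq M_i$ and for
all $\omega \in \{0,1\}^{|r|}$. We thus conclude that

$$0 = c_r \sum_{\omega \in \{0,1\}^{|r|}} (-1)^{|\omega|} \prod_{i=1}^{d}\prod_{j=1}^{M_i} a_{i,j}(\omega)^{r_{i,j}} + \sum_{s \in R \setminus \{r\} : |s| \leq |r|} c_s \sum_{\omega \in \{0,1\}^{|r|}} (-1)^{|\omega|} \prod_{i=1}^{d}\prod_{j=1}^{M_i} a_{i,j}(\omega)^{s_{i,j}}$$

for all $a \in \Sigma^{\{0,1\}^{|r|}}$ satisfying the parallelepiped constraints. Thus, to obtain the desired
contradiction, it will suffice to locate such an $a$ for which

$$\sum_{\omega \in \{0,1\}^{|r|}} (-1)^{|\omega|} \prod_{i=1}^{d}\prod_{j=1}^{M_i} a_{i,j}(\omega)^{r_{i,j}} \neq 0 \tag{8.2}$$

but such that

$$\sum_{\omega \in \{0,1\}^{|r|}} (-1)^{|\omega|} \prod_{i=1}^{d}\prod_{j=1}^{M_i} a_{i,j}(\omega)^{s_{i,j}} = 0 \tag{8.3}$$

for all $s \in R \setminus \{r\}$ with $|s| \leq |r|$.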

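We can do this explicitly as follows. Let us parametrise $\{0,1\}^{|r|}$ as $\prod_{i=1}^{d}\prod_{j=1}^{M_i} (\{0,1\}^i)^{r_{i,j}}$,
thus we write each $\omega \in \{0,1\}^{|r|}$ as $(\omega_{i,j,k,t})$, where $1 \leq i \leq d$, $1 \leq j \leq M_i$, $1 \leq k \leq i$ and
$1 \leq t \leq r_{i,j}$. Define $a \in \Sigma^{\{0,1\}^{|r|}}$ by

$$a_{i,j}(\omega) := \sum_{t=1}^{r_{i,j}} \prod_{k=1}^{i} \omega_{i,j,k,t}$$

where we embed $\{0,1\}$ into $\mathbb{F}$ in the obvious way. Since $a_{i,j}(\omega)$ is a linear combination
of products of $i$ coordinates of $\omega$, it is easy to see that $a$ satisfies the parallelepiped
constraints.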

Let us now verify (8.3). For fixed $i,j$, $a_{i,j}(\omega)$ depends only on the components lying in
$(\{0,1\}^i)^{r_{i,j}}$, which are disjoint as $i,j$ vary. We can therefore factorise the left-hand side
of (8.3) (with a hopefully obvious notation) as

$$\prod_{i=1}^{d}\prod_{j=1}^{M_i} \Bigl( \sum_{\eta \in (\{0,1\}^i)^{r_{i,j}}} (-1)^{|\eta|}\, a_{i,j}(0, \dots, 0, \eta, 0, \dots, 0)^{s_{i,j}} \Bigr),$$

where the notation is supposed to suggest that $\eta$ is in the $i,j$-part of the product
$\prod_{i=1}^{d}\prod_{j=1}^{M_i} (\{0,1\}^i)^{r_{i,j}}$. On the other hand, if $|s| \leq |r|$ and $s \neq r$, then from the pigeonhole
principle there must be some $i \leq d$ and some $j \leq M_i$ such that $s_{i,j} < r_{i,j}$. Fixing
this $i,j$, it thus suffices to show that

$$\sum_{\eta \in (\{0,1\}^i)^{r_{i,j}}} (-1)^{|\eta|}\, a_{i,j}(0, \dots, 0, \eta, 0, \dots, 0)^{s_{i,j}} = 0.$$

But we observe that $a_{i,j}(\omega)^{s_{i,j}}$ is a linear combination of products of $i s_{i,j}$ coordinates of
$\omega$, which is strictly less than $i r_{i,j}$, and the claim follows.

Now we verify (8.2). Performing the same factorisation as before, it suffices to show
that

$$\sum_{\eta \in (\{0,1\}^i)^{r_{i,j}}} (-1)^{|\eta|}\, a_{i,j}(0, \dots, 0, \eta, 0, \dots, 0)^{r_{i,j}} \neq 0 \tag{8.4}$$

for each $i,j$. But $a_{i,j}(0, \dots, 0, \eta, 0, \dots, 0)^{r_{i,j}}$ is equal to $r_{i,j}! \prod_{k=1}^{i}\prod_{t=1}^{r_{i,j}} \eta_{k,t}$ (viewed of
course as an element of $\mathbb{F}$), plus several other monomials, none of which involve all of
the $\eta_{k,t}$. From this we see that the left-hand side of (8.4) is simply $(-1)^{i r_{i,j}} r_{i,j}!$. Since
$r_{i,j} < |\mathbb{F}|$, this expression is non-zero in $\mathbb{F}$, as desired. □
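The two alternating sums at the heart of the proof of Theorem 8.2 are integer identities that can be checked directly for small block sizes. The sketch below evaluates $\sum_\eta (-1)^{|\eta|} a(\eta)^s$ with $a(\eta) = \sum_{t=1}^{r}\prod_{k=1}^{i}\eta_{k,t}$ over the integers; the function name and the tested values of $i$ and $r$ are assumptions made for illustration, and reduction modulo $|\mathbb{F}|$ is left to the reader.

```python
from itertools import product
from math import factorial

def alternating_power_sum(i, r, s):
    """Sum over eta in ({0,1}^i)^r of (-1)^{|eta|} a(eta)^s over the integers,
    where a(eta) counts the blocks of eta whose i coordinates all equal 1."""
    total = 0
    for flat in product((0, 1), repeat=i * r):
        blocks = [flat[t * i:(t + 1) * i] for t in range(r)]
        a = sum(all(block) for block in blocks)
        total += (-1) ** sum(flat) * a ** s
    return total

for i, r in [(1, 3), (2, 2), (2, 3), (3, 2)]:
    top = alternating_power_sum(i, r, r)
    lower = [alternating_power_sum(i, r, s) for s in range(r)]
    print(i, r, top, (-1) ** (i * r) * factorial(r), lower)
```

For exponent $s = r$ the sum equals $(-1)^{ir} r!$, matching (8.4), while every smaller exponent gives zero, matching the vanishing used for (8.3).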
Combining this theorem with Lemma 2.3 we immediately obtain the following corollary.

Corollary 8.3 (Minimal-degree representation of polynomials). Let $1 \leq d, D < |\mathbb{F}|$,
and let $F$ be a growth function. Then whenever $P \in \mathcal{P}_D(\mathbb{F}^n)$ is measurable with respect
to a factor $\mathcal{F}$ of order $d$ on $\mathbb{F}^n$, there exists an $F$-regular extension $\mathcal{F}'$ of $\mathcal{F}$ of order $d$
with $\dim(\mathcal{F}') \ll_{F,d,D,\dim(\mathcal{F})} 1$ such that $P$ has $\mathcal{F}'$-degree at most $D$.



9. A NULLSTELLENSATZ 

In this section we establish a kind of finite field analogue of Hilbert's Nullstellensatz.
These results are not needed elsewhere in the paper, but are illustrative applications of
the previous machinery, and may be of some independent interest.

Proposition 9.1 (Nullstellensatz). Let $k \geq 1$ and $1 \leq d < |\mathbb{F}|$, and let $P_1, \dots, P_k \in
\mathcal{P}_d(\mathbb{F}^n)$. Let $Q \in \mathcal{P}_d(\mathbb{F}^n)$ be such that $Q$ vanishes whenever $P_1, \dots, P_k$ all vanish. Then
there exist polynomials $R_1, \dots, R_k$ of degree $O_{d,k}(1)$ such that

$$Q(x) = P_1(x)R_1(x) + \dots + P_k(x)R_k(x)$$

for all $x \in \mathbb{F}^n$.

Proof. Let $\mathcal{F}$ be the degree $d$ factor defined by the polynomials $P_1, \dots, P_k, Q$. Let $F$ be
a growth function to be chosen later. By Lemma 2.3 we can extend $\mathcal{F}$ to an $F$-regular
factor $\mathcal{F}'$ of order $d$ and dimension $O_{d,k,F}(1)$. If $F$ is sufficiently rapid, then by Lemma
4.1 we see that the configuration map $\Phi' : \mathbb{F}^n \to \Sigma'$ corresponding to $\mathcal{F}'$ is surjective.
Since $P_1, \dots, P_k, Q$ are measurable with respect to $\sigma(\mathcal{F}')$, we can write $P_i = p_i \circ \Phi'$ and
$Q = q \circ \Phi'$ for some $p_i, q : \Sigma' \to \mathbb{F}$. Our assumption together with the surjectivity of $\Phi'$
implies that if $z \in \Sigma'$ is such that $p_i(z) = 0$ for $i = 1, \dots, k$ then $q(z) = 0$. By working
on each point $z$ separately, one can therefore find functions $r_1, \dots, r_k : \Sigma' \to \mathbb{F}$ such
that

$$q(z) = p_1(z)r_1(z) + \dots + p_k(z)r_k(z)$$

for all $z \in \Sigma'$. Composing with $\Phi'$ we conclude that

$$Q(x) = P_1(x)R_1(x) + \dots + P_k(x)R_k(x)$$

for all $x \in \mathbb{F}^n$, where $R_i := r_i \circ \Phi'$. As $\Sigma'$ has dimension $O_{d,k,F}(1)$, one can view
$r_1, \dots, r_k$ as polynomials of degree $O_{d,k,F}(1)$, and so $R_1, \dots, R_k$ are also polynomials of
degree $O_{d,k,F}(1)$. The claim follows. □
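The step "by working on each point $z$ separately" is elementary: at a point where some $p_i(z)$ is non-zero one can divide by it, and where all $p_i(z)$ vanish the hypothesis forces $q(z) = 0$. A minimal sketch of that pointwise construction follows; the function name and the sample values over $\mathbb{F}_7$ are assumptions, and the choice of the first non-vanishing index is just one of many valid choices.

```python
def pointwise_quotients(p_vals, q_val, p):
    """Given p_1(z),...,p_k(z) and q(z) in F_p, with q(z) = 0 whenever all p_i(z) = 0,
    return r_1(z),...,r_k(z) such that q(z) = sum_i p_i(z) r_i(z) mod p."""
    r_vals = [0] * len(p_vals)
    for idx, val in enumerate(p_vals):
        if val % p:
            # Put the whole of q(z) on the first non-vanishing p_i(z).
            r_vals[idx] = q_val * pow(val, -1, p) % p
            return r_vals
    if q_val % p:
        raise ValueError("q must vanish wherever all p_i vanish")
    return r_vals

prime = 7
p_vals, q_val = [0, 3, 5], 4
r_vals = pointwise_quotients(p_vals, q_val, prime)
check = sum(a * b for a, b in zip(p_vals, r_vals)) % prime
print(r_vals, check)
```

The modular inverse uses the three-argument `pow` with exponent `-1`, which requires Python 3.8 or later.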

In the above result the polynomials $R_i$ had bounded degree. However, if the polynomials
$P_1, \dots, P_k$ arose from a sufficiently regular factor, one can get the sharp degree bound
for $R_i$, namely $\deg(R_i) = \deg(Q) - \deg(P_i)$.

Proposition 9.2 (Exact nullstellensatz). Let $D, d, k \geq 0$. Then there exists a growth
function $F$ (depending on $D, d, k$) with the following property: given any $F$-regular
factor $\mathcal{F}$ of order $d$ and dimension at most $D$ on $\mathbb{F}^n$, and given any $Q \in \mathcal{P}_k(\mathbb{F}^n)$ which
vanishes whenever the polynomials $P_{i,j}$ defining $\mathcal{F}$ all vanish, there exist polynomials
$R_{i,j} \in \mathcal{P}_{k-i}(\mathbb{F}^n)$ for all $i \leq \min(d,k)$ and $j \leq M_i$ such that

$$Q(x) = \sum_{i=1}^{\min(d,k)} \sum_{j=1}^{M_i} P_{i,j}(x)R_{i,j}(x)$$

for all $x \in \mathbb{F}^n$.

Before embarking on the proof, we give a technical generalisation of the regularity
lemma, Lemma 2.3. Let us say that an extension $\mathcal{F}'$ of a factor $\mathcal{F}$ of order $d$ is
non-disruptive if we have $\mathcal{F}_i \subseteq \mathcal{F}'_i$ for all $i = 1, \dots, d$. Clearly if $\mathcal{F}'$ is a non-disruptive
extension of $\mathcal{F}$ and $\mathcal{F}'$ is $F$-regular, then $\mathcal{F}$ must also be $F$-regular. Our next lemma
can be regarded as a kind of converse to this fact.

Lemma 9.3 (Relative regularity lemma). Let $d, D \geq 1$ and let $F$ be a growth function.
Then there exists a growth function $\tilde{F}$ such that whenever $\mathcal{F}$ is a $\tilde{F}$-regular factor of
order $d$ on $\mathbb{F}^n$, and $\mathcal{F}'$ is an extension of $\mathcal{F}$ of dimension at most $D$, there exists an
$F$-regular extension $\mathcal{F}''$ of $\mathcal{F}'$ with the dimension bound

$$\dim(\mathcal{F}'') \ll_{F,d,D} 1 \tag{9.1}$$

such that $\mathcal{F}''$ is a non-disruptive extension of $\mathcal{F}$.

Proof. Fix $d, F$, and let $\tilde{F}$ be a sufficiently rapid growth function to be chosen later.
First observe that as the polynomials in $\mathcal{F}$ are $\mathcal{F}'$-measurable, we have the crude bound
$\dim(\mathcal{F}) \ll_D 1$, and so we may allow our constants to depend on $\dim(\mathcal{F})$ also.

By replacing $\mathcal{F}'_i$ with $\mathcal{F}'_i \cup \mathcal{F}_i$ for $1 \leq i \leq d$ if necessary (and increasing $D$ accordingly)
we may assume that $\mathcal{F}'$ is a non-disruptive extension of $\mathcal{F}$. We now keep $\mathcal{F}$ fixed and
induct on the dimension vector $(\dim(\mathcal{F}'_1), \dots, \dim(\mathcal{F}'_d))$ of $\mathcal{F}'$ in exactly the same way
as in Lemma 2.3 in order to obtain an $F$-regular extension $\mathcal{F}''$ of $\mathcal{F}'$ obeying (9.1). The
key point is that the low-rank polynomials $Q_i$ which arise in the proof of Lemma 2.3
can never arise from $\mathcal{F}_i$ if $\tilde{F}$ is chosen sufficiently rapid (thanks to (9.1)). Because of
this, we can easily arrange that the extension $\mathcal{F}''$ appearing in the proof of Lemma 2.3
continues to be a non-disruptive extension of $\mathcal{F}$, and the claim easily follows. □

Proof of Proposition 9.2. Fix $D, d, k \geq 0$. By adding dummy polynomials to $\mathcal{F}$ and
enlarging $d$ if necessary we may assume that $d \geq k$. Let $F_1$ be a growth function
depending on $D, d, k$ to be chosen later, and let $F$ be an even more rapid growth
function depending on $D, d, k, F_1$ and also to be chosen later.

Let $\mathcal{F}, Q$ be as in the statement of the proposition. Let $\mathcal{F}' = (P'_{i,j})_{i \in [d];\, j \leq M'_i}$ be the
factor of order $d$ formed by adjoining $Q$ to $\mathcal{F}$. Applying Lemma 9.3 we see (if $F$ is
sufficiently rapid depending on $D, d, k, F_1$) that we can find an $F_1$-regular extension
$\mathcal{F}'' = (P''_{i,j})_{i \in [d];\, j \leq M''_i}$ of $\mathcal{F}'$ of order $\max(d,k)$ which is a non-disruptive extension of $\mathcal{F}$.
Applying Theorem 8.2 we conclude (if $F_1$ is sufficiently rapid depending on $d, k$)
that $Q$ has $\mathcal{F}''$-degree at most $k$. Using the identity $x^{|\mathbb{F}|} = x$ to eliminate all exponents
greater than or equal to $|\mathbb{F}|$, we have a representation $Q(x) = q(\Phi''(x))$ for all $x \in \mathbb{F}^n$,
where $q : \Sigma'' \to \mathbb{F}$ is a polynomial which takes the form

$$q(t) := \sum_{s \in S_k} c_s \prod_{i=1}^{d}\prod_{j=1}^{M''_i} t_{i,j}^{s_{i,j}} \tag{9.2}$$

where $c_s \in \mathbb{F}$ for all $s \in S_k$, and $S_k$ is the collection of all tuples $(s_{i,j})_{1 \leq i \leq d;\, 1 \leq j \leq M''_i}$ of
non-negative integers $0 \leq s_{i,j} < |\mathbb{F}|$ obeying the weight condition

$$\sum_{i=1}^{d}\sum_{j=1}^{M''_i} i\, s_{i,j} \leq k.$$

By hypothesis, $Q(x)$ vanishes whenever all the $P_{i,j}(x)$ vanish for $i = 1, \dots, d$ and $j \leq M_i$.
On the other hand, by Lemma 4.1 we see (if $F_1$ is sufficiently rapid) that $\Phi'' : \mathbb{F}^n \to \Sigma''$
is surjective. We conclude that $q$ vanishes on the coordinate subspace

$$W := \{t \in \Sigma'' : t_{i,j} = 0 \text{ for all } i = 1, \dots, d \text{ and } j \leq M_i\}.$$

Restricting $q$ to $W$ and then equating coefficients (recalling from the Lagrange
interpolation formula that the coefficients are uniquely determined as long as all exponents are
less than $|\mathbb{F}|$) we conclude that $c_s$ vanishes for each $s \in S_k$ such that $s_{i,j} = 0$ for all $i,j$
with $i \leq d$ and $j \leq M_i$. From this, we can easily obtain a representation of the form

$$q(t) = \sum_{i=1}^{d}\sum_{j=1}^{M_i} t_{i,j}\, r_{i,j}(t)$$

where each $r_{i,j}$ has weighted degree at most $k - i$ in the sense that it can be expanded
into monomials as in (9.2) but using only exponents from $S_{k-i}$ rather than all of $S_k$. In
particular $r_{i,j}$ must vanish for $i > k$. Substituting $t = \Phi''(x)$ we obtain the claim. □



10. THE COUNTEREXAMPLE



In this section we analyse the counterexample to the inverse conjecture for the Gowers
norms in characteristic two by proving Theorem 1.4. Recall what is claimed in that
theorem: the elementary symmetric quartic

$$S_4(x) := \sum_{1 \leq i_1 < i_2 < i_3 < i_4 \leq n} x_{i_1}x_{i_2}x_{i_3}x_{i_4}$$

is such that $f(x) = (-1)^{S_4(x)}$ has large $U^4$-norm on $\mathbb{F}_2^n$, but this function does not
correlate well with any cubic phase.

We begin by establishing that the $U^4$-norm of this function is large. Define the
symmetric bilinear form $B : \mathbb{F}_2^n \times \mathbb{F}_2^n \to \mathbb{F}_2$ by

$$B(a,b) := \sum_{1 \leq i,j \leq n :\, i \neq j} a_i b_j \tag{10.1}$$

for $a = (a_1, \dots, a_n)$, $b = (b_1, \dots, b_n)$ in $\mathbb{F}_2^n$. One readily verifies the identity¹

$$D_aD_bD_cD_d S_4(x) = \sum_{1 \leq i,j,k,l \leq n :\, i,j,k,l \text{ distinct}} a_i b_j c_k d_l = B(a,b)B(c,d) + B(a,c)B(b,d) + B(a,d)B(b,c), \tag{10.2}$$

and so

$$\|f\|_{U^4}^{16} = \mathbb{E}_{a,b,c,d \in \mathbb{F}_2^n} (-1)^{B(a,b)B(c,d) + B(a,c)B(b,d) + B(a,d)B(b,c)}. \tag{10.3}$$
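Identity (10.2) is easy to test numerically, using the fact that $S_4(x) = \binom{|x|}{4} \bmod 2$ and that signs are irrelevant in characteristic 2, so the fourth derivative is a plain sum over the 16 vertices of a parallelepiped. The sketch below is a sanity check under those conventions; the helper names, the seed and the dimension $n = 7$ are assumptions made for illustration.

```python
from itertools import product
from math import comb
import random

def S4(x):
    """Elementary symmetric quartic on F_2^n, computed as C(|x|, 4) mod 2."""
    return comb(sum(x), 4) % 2

def B(a, b):
    """B(a, b) = sum over i != j of a_i b_j, reduced mod 2."""
    return (sum(a) * sum(b) - sum(ai * bi for ai, bi in zip(a, b))) % 2

def fourth_derivative(x, dirs):
    """D_a D_b D_c D_d S4 at x; in characteristic 2 this is a sum over 16 vertices."""
    total = 0
    for omega in product((0, 1), repeat=4):
        v = list(x)
        for bit, h in zip(omega, dirs):
            if bit:
                v = [(s + t) % 2 for s, t in zip(v, h)]
        total += S4(v)
    return total % 2

rng = random.Random(2)
n = 7
mismatches = 0
for _ in range(200):
    x, a, b, c, d = ([rng.randrange(2) for _ in range(n)] for _ in range(5))
    rhs = (B(a, b) * B(c, d) + B(a, c) * B(b, d) + B(a, d) * B(b, c)) % 2
    mismatches += fourth_derivative(x, [a, b, c, d]) != rhs
print(mismatches)
```

No mismatches should be found, and the result does not depend on the base point `x`, in line with the fact that the fourth derivative of a quartic is constant.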

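To compute this quantity, we will need to look at the distribution of the sextuplet

$$B_6(a,b,c,d) := (B(a,b), B(a,c), B(a,d), B(b,c), B(b,d), B(c,d)) \in \mathbb{F}_2^6 \tag{10.4}$$

as $a, b, c, d$ vary in $\mathbb{F}_2^n$. This distribution can be controlled by standard Gauss sum
estimates such as the following (cf. also Lemma 1.6).

Lemma 10.1 (Gauss sum estimate). For $(\xi_{ab}, \xi_{ac}, \xi_{ad}, \xi_{bc}, \xi_{bd}, \xi_{cd}) \in \mathbb{F}_2^6 \setminus \{0\}$
we have

$$\mathbb{E}_{a,b,c,d \in \mathbb{F}_2^n} (-1)^{\xi_{ab}B(a,b) + \xi_{ac}B(a,c) + \xi_{ad}B(a,d) + \xi_{bc}B(b,c) + \xi_{bd}B(b,d) + \xi_{cd}B(c,d)} = O(2^{-n/2}).$$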



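Proof. By symmetry we may assume $\xi_{ab} = 1$. It suffices to show that

$$\mathbb{E}_{a,b \in \mathbb{F}_2^n} (-1)^{B(a,b) + \xi_{ac}B(a,c) + \xi_{ad}B(a,d) + \xi_{bc}B(b,c) + \xi_{bd}B(b,d) + \xi_{cd}B(c,d)} = O(2^{-n/2})$$

uniformly in $c, d \in \mathbb{F}_2^n$. But if we fix $c, d$, we can write the left-hand side as

$$\mathbb{E}_{a,b \in \mathbb{F}_2^n} (-1)^{B(a,b) + L(a) + L'(b)}$$

for some $L, L' \in \mathcal{P}_1(\mathbb{F}_2^n)$. Applying Cauchy-Schwarz to eliminate the $(-1)^{L'(b)}$ factor,
we can estimate this quantity in absolute value by

$$\bigl|\mathbb{E}_{a,a',b \in \mathbb{F}_2^n} (-1)^{B(a,b) - B(a',b) + L(a) - L(a')}\bigr|^{1/2};$$

writing $c := a - a'$ this becomes

$$\bigl|\mathbb{E}_{c,b \in \mathbb{F}_2^n} (-1)^{B(c,b) - L''(c)}\bigr|^{1/2}$$

for some $L'' \in \mathcal{P}_1(\mathbb{F}_2^n)$. Performing the $b$ average using Fourier analysis and using the
triangle inequality, we can bound this by

$$\bigl|\mathbb{P}_{c \in \mathbb{F}_2^n}(B(c,b) = 0 \text{ for all } b \in \mathbb{F}_2^n)\bigr|^{1/2}.$$

But $B$ has rank $n - O(1)$, and so

$$\mathbb{P}_{c \in \mathbb{F}_2^n}(B(c,b) = 0 \text{ for all } b \in \mathbb{F}_2^n) = O(2^{-n}).$$

The claim follows. □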
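The final step of the proof, where averaging $(-1)^{B(c,b)}$ over $b$ detects whether $c$ lies in the kernel of $B$, can be checked exhaustively for small $n$. The sketch below computes the unnormalised sum of $(-1)^{B(a,b)}$ and the kernel size by brute force; the function names are assumptions made for this illustration.

```python
from itertools import product

def B(a, b):
    """B(a, b) = sum over i != j of a_i b_j, reduced mod 2."""
    return (sum(a) * sum(b) - sum(ai * bi for ai, bi in zip(a, b))) % 2

def gauss_sum(n):
    """Unnormalised sum of (-1)^{B(a,b)} over all a, b in F_2^n."""
    pts = list(product((0, 1), repeat=n))
    return sum((-1) ** B(a, b) for a in pts for b in pts)

def kernel_size(n):
    """Number of c in F_2^n with B(c, b) = 0 for every b."""
    pts = list(product((0, 1), repeat=n))
    return sum(all(B(c, b) == 0 for b in pts) for c in pts)

for n in range(2, 7):
    print(n, gauss_sum(n), kernel_size(n))
```

The sum equals $2^n$ times the kernel size, and the kernel has at most two elements (the zero vector, together with the all-ones vector when $n$ is odd), which is one way to see that $B$ has rank $n - O(1)$.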

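From this lemma and Fourier analysis on $\mathbb{F}_2^6$ (as in the proof of Lemma 4.1) we see that
$B_6$ is equidistributed in the sense that

$$\mathbb{P}_{a,b,c,d \in \mathbb{F}_2^n}(B_6(a,b,c,d) = q) = 2^{-6} + O(2^{-n/2}) \text{ for all } q \in \mathbb{F}_2^6.$$

It follows that (10.3) can be rewritten as

$$\mathbb{E}_{q_{ab},q_{ac},q_{ad},q_{bc},q_{bd},q_{cd} \in \mathbb{F}_2} (-1)^{q_{ab}q_{cd} + q_{ac}q_{bd} + q_{ad}q_{bc}} + O(2^{-n/2}).$$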

¹ For a generalisation of this identity, see Lemma 11.2 below.
But we can factorise the expectation and rewrite this expression as

$$\bigl(\mathbb{E}_{q,q' \in \mathbb{F}_2} (-1)^{qq'}\bigr)^3 + O(2^{-n/2}).$$

Since $\mathbb{E}_{q,q' \in \mathbb{F}_2}(-1)^{qq'} = \frac{1}{2}$ it follows that $\|f\|_{U^4}^{16} = \frac{1}{8} + O(2^{-n/2})$ as asserted in (1.4) of
Theorem 1.4.
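The main term is a finite average over $\mathbb{F}_2^6$ and can be evaluated exactly. The short sketch below does so with rational arithmetic; the variable names are assumptions chosen to mirror the six coordinates of the sextuplet.

```python
from fractions import Fraction
from itertools import product

# Average of (-1)^{q_ab q_cd + q_ac q_bd + q_ad q_bc} over a uniform sextuple in F_2^6.
total = sum((-1) ** (qab * qcd + qac * qbd + qad * qbc)
            for qab, qac, qad, qbc, qbd, qcd in product((0, 1), repeat=6))
main_term = Fraction(total, 2 ** 6)

# The single-pair average E (-1)^{q q'} that appears after factorising.
pair = Fraction(sum((-1) ** (q * r) for q in (0, 1) for r in (0, 1)), 4)
print(main_term, pair ** 3)
```

Both printed values are $1/8$: the three products involve disjoint pairs of coordinates, so the expectation factorises into three copies of the pair average.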

Now we turn to (1.5), which asserts that $f$ does not have substantial correlation with
a cubic phase. Let us remind the reader once more that a better bound is contained
in the independent work of Lovett, Meshulam and Samorodnitsky [15]. Our bound is
all but contained in Alon and Beigel [2, Theorem 7], although we recall that argument
here for the convenience of the reader.

If $x = (x_1, \dots, x_n) \in \mathbb{F}_2^n$, let $|x|$ denote the number of indices $i \in [n]$ for which $x_i = 1$.
It is clear that $S_d(x) = \binom{|x|}{d} \pmod 2$. Recalling Lucas' theorem on binomial coefficients
(mod $p$), which states that

$$\binom{a}{b} \equiv \prod_{i=0}^{k} \binom{a_i}{b_i} \pmod p \tag{10.5}$$

whenever $a = a_0 + a_1p + a_2p^2 + \dots + a_kp^k$ and $b = b_0 + b_1p + b_2p^2 + \dots + b_kp^k$ with
$0 \leq a_i, b_i < p$, we see that

$S_0(x) = 1$;
$S_1(x) = 1$ iff $|x| \equiv 1 \pmod 2$;
$S_2(x) = 1$ iff $|x| \equiv 2, 3 \pmod 4$;
$S_3(x) = 1$ iff $|x| \equiv 3 \pmod 4$; and
$S_4(x) = 1$ iff $|x| \equiv 4, 5, 6, 7 \pmod 8$.
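Lucas' theorem (10.5) and the resulting periodicity of $S_1, \dots, S_4$ in $|x|$ are straightforward to confirm by machine. The following sketch implements the digit-by-digit product and compares it with a direct computation; the function name is an assumption made for this illustration.

```python
from math import comb

def lucas_binomial_mod(a, b, p):
    """C(a, b) mod p via Lucas' theorem: multiply binomials of the base-p digits."""
    result = 1
    while a or b:
        result = result * comb(a % p, b % p) % p
        a, b = a // p, b // p
    return result

# Parity patterns of C(m, d) for d = 1, 2, 3, 4, indexed by m = |x|.
patterns = {
    1: [m % 2 == 1 for m in range(32)],
    2: [m % 4 in (2, 3) for m in range(32)],
    3: [m % 4 == 3 for m in range(32)],
    4: [m % 8 >= 4 for m in range(32)],
}
agree = all(lucas_binomial_mod(m, d, 2) == int(patterns[d][m])
            for d in patterns for m in range(32))
print(agree)
```

Note that `math.comb` returns 0 when the lower index exceeds the upper one, which is exactly the convention needed for a digit $b_i > a_i$.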



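On the other hand we have, by a technique once known² as "multisection of series",

$$\mathbb{P}_{x \in \mathbb{F}_2^n}(|x| \equiv a \pmod 8) = 2^{-n} \sum_{j \equiv a \pmod 8} \binom{n}{j} = \frac{1}{8} \sum_{r=0}^{7} e^{-2\pi i a r/8} \Bigl(\frac{1 + e^{2\pi i r/8}}{2}\Bigr)^{n} = \frac{1}{8} + O(2^{-\Omega(n)}).$$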
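The multisection formula is the standard roots-of-unity filter, and the error term comes from the terms with $r \neq 0$, each of size at most $\cos(\pi/8)^n$. The sketch below compares direct counting with the filter; the function names and the choice $n = 40$ are assumptions made for this illustration.

```python
import cmath
from math import comb, cos, pi

def residue_probability(n, a, m=8):
    """P(|x| = a mod m) for uniform x in F_2^n, by direct counting."""
    return sum(comb(n, j) for j in range(n + 1) if j % m == a % m) / 2 ** n

def multisection(n, a, m=8):
    """The same probability computed with the roots-of-unity filter."""
    total = 0
    for r in range(m):
        w = cmath.exp(2j * cmath.pi * r / m)
        total += w ** (-a) * ((1 + w) / 2) ** n
    return (total / m).real

n = 40
for a in range(8):
    print(a, residue_probability(n, a), multisection(n, a))
```

Each probability is within $\cos(\pi/8)^n$ of $1/8$, an exponentially small deviation, consistent with the $O(2^{-\Omega(n)})$ error term.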

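From these facts and some computation we easily conclude that

$$\mathbb{E}_{x \in \mathbb{F}_2^n} (-1)^{S_4(x) + c_3S_3(x) + c_2S_2(x) + c_1S_1(x) + c_0S_0(x)} = O(2^{-\Omega(n)})$$

for all coefficients $c_0, c_1, c_2, c_3 \in \mathbb{F}_2$. Clearly this immediately implies that

$$\mathbb{E}_{x \in \mathbb{F}_2^n} (-1)^{S_4(x) + c_3S_3(x) + c_2S_2(x) + c_1S_1(x) + c_0S_0(x) - Q_0(x)} = O(2^{-\Omega(n)}) \tag{10.6}$$

whenever $Q_0 \in \mathcal{P}_0(\mathbb{F}_2^n)$ and $c_0, c_1, c_2, c_3 \in \mathbb{F}_2$.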
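The "some computation" amounts to observing that, once $|x|$ is treated as uniform modulo 8, the phase sums to exactly zero over the eight residue classes, whatever the lower-order coefficients are. A minimal check, with a function name assumed for the illustration:

```python
from itertools import product
from math import comb

def residue_sum(c3, c2, c1, c0):
    """Sum over m = 0..7 of (-1)^{S4 + c3 S3 + c2 S2 + c1 S1 + c0 S0}, with S_d = C(m, d) mod 2."""
    total = 0
    for m in range(8):
        expo = comb(m, 4) + c3 * comb(m, 3) + c2 * comb(m, 2) + c1 * comb(m, 1) + c0
        total += (-1) ** (expo % 2)
    return total

sums = {c: residue_sum(*c) for c in product((0, 1), repeat=4)}
print(set(sums.values()))
```

The cancellation occurs because $S_1, S_2, S_3$ depend only on $|x|$ modulo 4, while $S_4$ flips the sign between the residues $0,1,2,3$ and $4,5,6,7$.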



² One can also interpret this computation as exhibiting (by the usual Fourier-analytic method) the
exponential mixing rate of a simple random walk on $\mathbb{Z}/8\mathbb{Z}$.
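Now suppose instead that $Q_1 \in \mathcal{P}_1(\mathbb{F}_2^n)$ and $c_0, c_1, c_2, c_3 \in \mathbb{F}_2$, and consider the average

$$\mathbb{E}_{x \in \mathbb{F}_2^n} (-1)^{S_4(x) + c_3S_3(x) + c_2S_2(x) + c_1S_1(x) + c_0S_0(x) - Q_1(x)}. \tag{10.7}$$

Then we can write

$$Q_1(x) = \sum_{i \in E} x_i + Q_0$$

for some $Q_0 \in \mathcal{P}_0(\mathbb{F}_2^n)$ and some set $E \subseteq \{1, \dots, n\}$. We can thus find a set $I \subseteq
\{1, \dots, n\}$ of size $m := \lfloor n/2 \rfloor$ which either lies in $E$, or is disjoint from $E$. By permuting
the coefficients we can write $I = \{1, \dots, m\}$. Then by freezing the coefficients $y :=
(x_{m+1}, \dots, x_n) \in \mathbb{F}_2^{n-m}$, we see that we can write (10.7) as an average of expressions of
the form

$$\mathbb{E}_{x' \in \mathbb{F}_2^m} (-1)^{S_4(x') + c_{3,y}S_3(x') + c_{2,y}S_2(x') + c_{1,y}S_1(x') + c_{0,y}S_0(x') - Q_{0,y}}$$

for some $c_{0,y}, \dots, c_{3,y} \in \mathbb{F}_2$ and $Q_{0,y} \in \mathcal{P}_0(\mathbb{F}_2^m)$. Applying (10.6) and the triangle
inequality we thus conclude that

$$\mathbb{E}_{x \in \mathbb{F}_2^n} (-1)^{S_4(x) + c_3S_3(x) + c_2S_2(x) + c_1S_1(x) + c_0S_0(x) - Q_1(x)} = O(2^{-\Omega(n)}). \tag{10.8}$$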



Now suppose instead that $Q_2 \in \mathcal{P}_2(\mathbb{F}_2^n)$ and $c_0, c_1, c_2, c_3 \in \mathbb{F}_2$, and consider the average

$$\mathbb{E}_{x \in \mathbb{F}_2^n} (-1)^{S_4(x) + c_3 S_3(x) + c_2 S_2(x) + c_1 S_1(x) + c_0 S_0 - Q_2(x)}.$$

Then we can write

$$Q_2(x) = \sum_{\{i,j\} \in E} x_i x_j + Q_1(x)$$

for some $Q_1 \in \mathcal{P}_1(\mathbb{F}_2^n)$ and some graph $E$ on vertex set $[n]$. By Ramsey's theorem (see
e.g. [10, Section 4.2]), we can find a set $I \subset [n]$ of size $m = \Omega(\log n)$ such that the
complete graph on vertex set $I$ either lies completely inside $E$, or is disjoint from $E$.
We can then repeat the above freezing argument (using (10.8) instead of (10.6)) and
conclude that

$$\mathbb{E}_{x \in \mathbb{F}_2^n} (-1)^{S_4(x) + c_3 S_3(x) + c_2 S_2(x) + c_1 S_1(x) + c_0 S_0 - Q_2(x)} = O(2^{-\Omega(m)}) = O(n^{-\Omega(1)}).$$



Finally, suppose $Q_3 \in \mathcal{P}_3(\mathbb{F}_2^n)$ and $c_0, c_1, c_2, c_3 \in \mathbb{F}_2$, and consider the average

$$\mathbb{E}_{x \in \mathbb{F}_2^n} (-1)^{S_4(x) + c_3 S_3(x) + c_2 S_2(x) + c_1 S_1(x) + c_0 S_0 - Q_3(x)}.$$

Then we can write

$$Q_3(x) = \sum_{\{i,j,k\} \in E} x_i x_j x_k + Q_2(x)$$

for some $Q_2 \in \mathcal{P}_2(\mathbb{F}_2^n)$ and some 3-uniform hypergraph $E$ on vertex set $[n]$. Applying the
bounds of Erdős and Rado for the hypergraph Ramsey theorem (see e.g. [10, Section
4.7]) we can find a set $I \subset [n]$ of size $m = \Omega(\log\log n)$ such that the complete 3-uniform
hypergraph on $I$ either lies completely inside $E$ or is disjoint from $E$. Using the freezing
argument one last time, we obtain

$$\mathbb{E}_{x \in \mathbb{F}_2^n} (-1)^{S_4(x) + c_3 S_3(x) + c_2 S_2(x) + c_1 S_1(x) + c_0 S_0 - Q_3(x)} = O(m^{-\Omega(1)}) = O((\log\log n)^{-\Omega(1)}). \tag{10.9}$$

This is a bound of the form claimed in (1.5) of Theorem 1.4, except there is an extra
logarithm. To remove it, we run the two Ramsey-theoretic arguments in parallel, by
using the following variant of the Erdős-Rado bound.
Lemma 10.2 (Simultaneous Ramsey theorem). Let $E_2 \subset \binom{[n]}{2}$ and $E_3 \subset \binom{[n]}{3}$ be a
graph and 3-uniform hypergraph respectively. Then there exists a set $I \subset [n]$ of size
$m = \Omega(\log\log n)$ such that for each $j = 2, 3$, the set $\binom{I}{j}$ either lies completely inside $E_j$
or is disjoint from $E_j$.

Proof. We generate some vertices $x_1, \ldots, x_l$ by the following algorithm:

• Step 0. Initialise $l := 0$ and $J := [n]$.

• Step 1. By the pigeonhole principle, there exists $J' \subset J$ with $|J'| \ge 2^{-O(l^2)} |J|$
such that for any $i, j \in [l]$ and $x \in J'$, the truth values of the statements
$\{x_i, x\} \in E_2$ and $\{x_i, x_j, x\} \in E_3$ are independent of $x$. Fix this $J'$.

• Step 2. Set $x_{l+1} := \min(J')$, replace $J$ by $J' \setminus \{x_{l+1}\}$, and increment $l$ to $l + 1$.
If $J$ is non-empty then return to Step 1; otherwise STOP.

One easily verifies that this algorithm terminates after $l = \Omega(\log^{1/3} n)$ steps to obtain a
sequence $1 \le x_1 < \ldots < x_l \le n$ with the property that for any $1 \le i < j \le l$, the truth
value of $\{x_i, x_j\} \in E_2$ is independent of $j$, and for any $1 \le i < j < k \le l$, the truth
value of $\{x_i, x_j, x_k\} \in E_3$ is independent of $k$. By an appeal to Ramsey's theorem for
graphs one can then find a set $I \subset \{x_1, \ldots, x_l\}$ with $|I| \gg \log l \gg \log\log n$ with the
desired properties. □
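The refinement procedure in the proof can be sketched in code as follows. This is an illustration only: the membership predicates `in_e2` and `in_e3`, the helper names, and the choice of always keeping the largest pigeonhole class are assumptions made for the sketch, and it does not carry out the final appeal to Ramsey's theorem.

```python
from itertools import combinations
import random

def refine_sequence(n, in_e2, in_e3):
    """Steps 0-2: greedily pick x_1 < x_2 < ... so that edge and triple
    membership involving earlier vertices does not depend on the later vertex."""
    chosen = []                           # Step 0: l = 0
    candidates = list(range(1, n + 1))    # Step 0: J = [n]
    while candidates:
        # Step 1: pigeonhole on the answers to all statements about chosen vertices.
        buckets = {}
        for x in candidates:
            signature = (
                tuple(in_e2(a, x) for a in chosen),
                tuple(in_e3(a, b, x) for a, b in combinations(chosen, 2)),
            )
            buckets.setdefault(signature, []).append(x)
        largest = max(buckets.values(), key=len)   # this plays the role of J'
        # Step 2: take the minimum of J' and continue inside the rest of J'.
        new_vertex = min(largest)
        chosen.append(new_vertex)
        candidates = [x for x in largest if x != new_vertex]
    return chosen

def is_prehomogeneous(seq, in_e2, in_e3):
    """Check the two independence properties claimed for the output sequence."""
    for i, a in enumerate(seq):
        if len({in_e2(a, x) for x in seq[i + 1:]}) > 1:
            return False
        for j in range(i + 1, len(seq)):
            if len({in_e3(a, seq[j], x) for x in seq[j + 1:]}) > 1:
                return False
    return True

def random_instance(n, seed):
    """A random graph and 3-uniform hypergraph on [n], for experimentation only."""
    rng = random.Random(seed)
    e2 = {frozenset(p) for p in combinations(range(1, n + 1), 2) if rng.random() < 0.5}
    e3 = {frozenset(t) for t in combinations(range(1, n + 1), 3) if rng.random() < 0.5}
    return e2, e3
```

On the output sequence, membership of $\{x_i, x_j\}$ in $E_2$ is a colouring of the vertices and membership of $\{x_i, x_j, x_k\}$ in $E_3$ is a colouring of the pairs $\{i, j\}$, which is why a graph Ramsey argument suffices at the last step.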

Note that by applying Ramsey's theorem for graphs and 3-uniform hypergraphs sequentially,
one would only get $m = \Omega(\log\log\log n)$ here. The reader can easily verify that
the logarithmic saving in this lemma propagates through the previous arguments to
improve (10.9) to (1.5). □



11. General degrees and characteristics 

It is natural to wonder for which $\mathbb{F}$ and $d$ the symmetric polynomials $S_d$ on $\mathbb{F}^n$ provide
counterexamples to Conjecture 1.2, the inverse conjecture for the $U^d$-norm. We do
not have a complete answer to this question, but we give some partial results in this
direction here. For a more in-depth treatment of these issues, we refer the reader to the
recent preprint [15].

We begin with a general result that shows that $\|e_{\mathbb{F}}(S_d)\|_{U^d}$ is large whenever $d >
|\mathbb{F}|$. This result (and in fact a generalisation of it which establishes the largeness of
$\|e_{\mathbb{F}}(S_d)\|_{U^{d-p+2}}$ for $d \ge 2p$, where $p = |\mathbb{F}|$) was shown to us by the authors of [15] before
we wrote this section. The following argument is a slight variant of theirs which, we
believe, is worth having in the literature.

Theorem 11.1 (Lower bound on Gowers norm). Let $\mathbb{F}$ be a finite field, let $n \ge 1$, and
let $d > |\mathbb{F}|$. Let $S_d$ be the symmetric polynomial on $\mathbb{F}^n$, and let $f := e_{\mathbb{F}}(S_d)$. Then

$$\|f\|_{U^d} \gg_{\mathbb{F}} 1.$$
Proof. For this, we must find some analogue of the computations earlier in the section
and, in particular, the identity (10.2). For this we need some more notation. Let
$\Pi_d$ denote the collection of all partitions $\pi = \{C_1, \ldots, C_m\}$ of $[d]$ into disjoint sets
$[d] = C_1 \cup \ldots \cup C_m$. For any partition $\pi = \{C_1, \ldots, C_m\} \in \Pi_d$, we associate the
multilinear form $R_\pi : \mathbb{F}^n \times \ldots \times \mathbb{F}^n \to \mathbb{F}$ by

$$R_\pi(h^{(1)}, \ldots, h^{(d)}) := \prod_{k=1}^{m} \sum_{j=1}^{n} \prod_{i \in C_k} h^{(i)}_j.$$

Thus for example if $\pi$ is the partition of $[3]$ into $\{1, 2\}$ and $\{3\}$ then we have

$$R_\pi(h^{(1)}, h^{(2)}, h^{(3)}) = \big(h^{(1)}_1 h^{(2)}_1 + \ldots + h^{(1)}_n h^{(2)}_n\big)\big(h^{(3)}_1 + \ldots + h^{(3)}_n\big).$$

We define the Möbius function $\mu(\pi)$ at $\pi = \{C_1, \ldots, C_m\}$ by the formula

$$\mu(\pi) := \prod_{k} (-1)^{|C_k| - 1} (|C_k| - 1)!. \tag{11.1}$$



We place a partial ordering on partitions $\pi$ by declaring $\pi' \le \pi$ if every set in $\pi'$ is
contained in some set in $\pi$. This has a minimal element $\pi_{\min} := \{\{1\}, \ldots, \{d\}\}$. The
Möbius function can be shown to obey the Möbius inversion identities $\mu(\pi_{\min}) = 1$ and

$$\sum_{\pi' \le \pi} \mu(\pi') = 0 \quad \text{if } \pi \ne \pi_{\min}.$$

As a consequence we obtain the following variant of (10.2), which follows from [15,
Proposition 2.7].

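Lemma 11.2 (Derivative of symmetric function). For any $d \ge 1$ and $h^{(1)}, \ldots, h^{(d)}, x \in
\mathbb{F}^n$, we have

$$D_{h^{(1)}} \ldots D_{h^{(d)}} S_d(x) = \sum_{\pi} \mu(\pi) R_\pi(h^{(1)}, \ldots, h^{(d)}), \tag{11.2}$$

where the sum is over all partitions $\pi$ of $[d]$.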



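Proof. Each $R_\pi$ may be expanded as a sum

$$R_\pi(h^{(1)}, \ldots, h^{(d)}) = \sum_{\substack{1 \le i_1, \ldots, i_d \le n \\ \pi \le \tau(i_1, \ldots, i_d)}} h^{(1)}_{i_1} \ldots h^{(d)}_{i_d} \tag{11.3}$$

where $\tau(i_1, \ldots, i_d)$ is the partition of $[d]$ induced by the indices $i_1, \ldots, i_d$, two elements
$s, t$ being placed in the same element of this partition if and only if $i_s = i_t$.

On the other hand, from the proof of Theorem 1.4 we have

$$D_{h^{(1)}} \ldots D_{h^{(d)}} S_d(x) = \sum_{\substack{1 \le i_1, \ldots, i_d \le n \\ i_1, \ldots, i_d \text{ distinct}}} h^{(1)}_{i_1} \ldots h^{(d)}_{i_d}. \tag{11.4}$$

The claim now follows from the Möbius inversion formula. □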
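The identity (11.2) can be tested numerically for small parameters. The sketch below assumes the additive derivative $D_h P(x) = P(x+h) - P(x)$ and a prime modulus $p$; all function names are illustrative, and positions $0, \ldots, d-1$ stand in for the elements of $[d]$.

```python
from itertools import combinations, product
from math import factorial

def elementary_symmetric(x, d, p):
    """S_d(x) mod p: sum of all products of d distinct coordinates."""
    total = 0
    for idx in combinations(range(len(x)), d):
        term = 1
        for i in idx:
            term *= x[i]
        total += term
    return total % p

def derivative_of_symmetric(x, shifts, p):
    """D_{h(1)} ... D_{h(d)} S_d(x) mod p, expanded over the cube {0,1}^d."""
    d = len(shifts)
    total = 0
    for omega in product((0, 1), repeat=d):
        point = list(x)
        for w, h in zip(omega, shifts):
            if w:
                point = [(a + b) % p for a, b in zip(point, h)]
        total += (-1) ** (d - sum(omega)) * elementary_symmetric(point, d, p)
    return total % p

def set_partitions(elements):
    """Yield every partition of the list `elements` as a list of blocks."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for partition in set_partitions(rest):
        yield [[first]] + partition
        for k in range(len(partition)):
            yield partition[:k] + [[first] + partition[k]] + partition[k + 1:]

def mobius(partition):
    """mu(pi) as in (11.1): product over blocks of (-1)^(|C|-1) (|C|-1)!."""
    value = 1
    for block in partition:
        value *= (-1) ** (len(block) - 1) * factorial(len(block) - 1)
    return value

def r_form(partition, shifts, p):
    """R_pi(h(1), ..., h(d)) mod p: product over blocks of sum_j prod_{i in block} h(i)_j."""
    value = 1
    for block in partition:
        block_sum = 0
        for j in range(len(shifts[0])):
            term = 1
            for i in block:
                term *= shifts[i][j]
            block_sum += term
        value *= block_sum
    return value % p

def mobius_side(shifts, p):
    """Right-hand side of (11.2): sum over partitions of mu(pi) R_pi."""
    blocks = list(range(len(shifts)))
    return sum(mobius(pi) * r_form(pi, shifts, p) for pi in set_partitions(blocks)) % p
```

Comparing `derivative_of_symmetric(x, shifts, p)` with `mobius_side(shifts, p)` on random inputs gives a quick sanity check; note that the right-hand side does not involve `x`, in line with $S_d$ having degree $d$.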



-"See for instance the series of exercises [H p. 103], or fT', Lemma 4.1] 
To apply the identity (11.2), we let $V \subset \mathbb{F}^n$ be the variety

$$V := \{x \in \mathbb{F}^n : S_1(x) = S_2(x) = \ldots = S_p(x) = 0\},$$

where $p = |\mathbb{F}|$ (later on we will specialize to the case $p = 2$). We claim the identity

$$\Delta_{h^{(1)}} \ldots \Delta_{h^{(d)}} (f 1_V)(x) = \Delta_{h^{(1)}} \ldots \Delta_{h^{(d)}} (1_V)(x) \tag{11.5}$$

for all $x, h^{(1)}, \ldots, h^{(d)} \in \mathbb{F}^n$. To prove (11.5), it suffices to show that

$$D_{h^{(1)}} \ldots D_{h^{(d)}} S_d(x) = 0$$

whenever $x, h^{(1)}, \ldots, h^{(d)} \in \mathbb{F}^n$ are such that the cube $\{x + \omega_1 h^{(1)} + \ldots + \omega_d h^{(d)} :
\omega_1, \ldots, \omega_d \in \{0, 1\}\}$ lies in $V$. But if $x, h^{(1)}, \ldots, h^{(d)}$ are such elements then, by definition
of $V$ and differentiation, we have

$$D_{h^{(i_1)}} \ldots D_{h^{(i_j)}} S_j(x) = 0 \tag{11.6}$$

for all $j \in \{1, \ldots, p\}$ and distinct $i_1, \ldots, i_j \in \{1, \ldots, d\}$. Note from (11.1) that the
Möbius function $\mu(\pi_{\mathrm{triv},j})$ is invertible in $\mathbb{F}$ for all $1 \le j \le p$, where $\pi_{\mathrm{triv},j}$ is the
trivial partition $\{\{1, \ldots, j\}\}$ of $[j]$. By expanding the left-hand side of (11.6) using the
inversion formula (11.2), we conclude recursively that

$$R_{\pi_{\mathrm{triv},j}}(h^{(i_1)}, \ldots, h^{(i_j)}) = 0$$

for all $1 \le j \le p$ and distinct $i_1, \ldots, i_j \in \{1, \ldots, d\}$. This implies that

$$R_\pi(h^{(1)}, \ldots, h^{(d)}) = 0$$

whenever all sets in $\pi$ have cardinality at most $p$. On the other hand, if any set in $\pi$
has cardinality greater than $p$, we see from (11.1) that $\mu(\pi)$ vanishes in $\mathbb{F}$. The claim
(11.5) now follows from one last application of (11.2).

Using (11.5) and Definition 1.1 we conclude that

$$\|f 1_V\|_{U^d} = \|1_V\|_{U^d}.$$

But by monotonicity of Gowers norms (see e.g. [19, Chapter 11]) we have

$$\|1_V\|_{U^d} \ge \|1_V\|_{U^1} = |V| / |\mathbb{F}^n|.$$

By applying Lemma \77I\ we have $|V| / |\mathbb{F}^n| \gg_{\mathbb{F}} 1$, and so

$$\|f 1_V\|_{U^d} \gg_{\mathbb{F}} 1.$$

On the other hand, we have the Fourier expansion

$$1_V = \mathbb{E}_{\xi \in \mathbb{F}^p} e_{\mathbb{F}}(\xi_1 S_1 + \ldots + \xi_p S_p).$$

Using the triangle inequality for Gowers norms (see e.g. [9, Lemma 3.9] or [19, Chapter
11]) we conclude that

$$\|f e_{\mathbb{F}}(\xi_1 S_1 + \ldots + \xi_p S_p)\|_{U^d} \gg_{\mathbb{F}} 1$$

for some $\xi_1, \ldots, \xi_p \in \mathbb{F}$. Theorem 11.1 now follows from (1.1) and the hypothesis that
$d > p$. □

As a consequence of the above theorem, we can completely characterise the behaviour
of $(-1)^{S_d}$ in the characteristic 2 case.
Theorem 11.3 (Gowers norm behaviour of $S_d$ over $\mathbb{F}_2$). Let $n \ge 1$ and $d \ge 1$ be
integers, let $\mathbb{F} = \mathbb{F}_2$, and let $f := (-1)^{S_d}$ where $S_d$ is the $d^{\text{th}}$ elementary symmetric
function on $\mathbb{F}_2^n$.

• If $d = 1, 2$, then $\|f\|_{U^d} = o(1)$.

• If $d$ is not a power of 2, then $\mathrm{rank}_{d-1}(S_d) \le 2$ and $\|f\|_{U^d} \ge \|f\|_{u^d} \ge \frac{1}{2}$.

• If $d$ is a power of 2 which is at least 4, then $\|f\|_{U^d} \gg 1$ and $\|f\|_{u^d} = o_d(1)$,
where $o_d(1)$ goes to zero as $n \to \infty$ for fixed $d$. (In particular, Conjecture 1.2
fails for the $U^d$-norm on $\mathbb{F}_2$ for these values of $d$.)

Proof. The cases $d = 1, 2$ can be computed by hand (using Lemma 1.6 for the $d = 2$
case). If $d$ is not a power of 2, then from Lucas' theorem (10.5) we can express $S_d$ as
a product $S_{d_1} S_{d_2}$ for some $d_1, d_2$ with $0 < d_1, d_2 < d$ and $d = d_1 + d_2$, which gives the
desired bound on $\mathrm{rank}_{d-1}(S_d)$. By Fourier analysis in $\mathbb{F}_2^2$ we may therefore write

$$(-1)^{S_d} = \tfrac{1}{2}\big(1 + (-1)^{S_{d_1}} + (-1)^{S_{d_2}} - (-1)^{S_{d_1} + S_{d_2}}\big).$$

Thus $(-1)^{S_d}$ must have an inner product of at least $\frac{1}{2}$ with at least one polynomial
phase of degree strictly less than $d$, which gives the lower bound on $\|f\|_{u^d}$ in this case.
The lower bound on $\|f\|_{U^d}$ then follows from (1.2).

When $d$ is a power of 2, one verifies (as in the proof of Theorem 1.4) that $S_d(x) = 1$
precisely when $|x|$ is equal to $d, \ldots, 2d - 1 \pmod{2d}$, whereas $S_{d'}$ for $d' < d$ is periodic with
period dividing $d$. Using multisection of series as before, we can conclude an analogue
of (10.6) for $S_d$ instead of $S_4$, and by repeating the Ramsey arguments one obtains the
desired bound $\|f\|_{u^d} = o_d(1)$. Finally, the lower bound on $\|f\|_{U^d}$ follows from Theorem
11.1. This establishes all the claims of the theorem. □
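The two arithmetic facts used in this proof can be checked directly on Hamming weights, using the formula $S_d(x) = \binom{|x|}{d} \pmod 2$ from Section 10. The sketch below is illustrative only; in particular, splitting off the lowest binary digit of $d$ is one possible choice of $d_1$, made here as an assumption.

```python
from math import comb

def s_mod2(d, weight):
    """Value of S_d on any x in F_2^n of Hamming weight `weight`, via C(|x|, d) mod 2."""
    return comb(weight, d) % 2

def is_power_of_two(d):
    return d > 0 and d & (d - 1) == 0

def split_degree(d):
    """For d not a power of 2, write d = d1 + d2 with d1 the lowest set bit of d."""
    d1 = d & -d
    return d1, d - d1

def check_theorem_facts(max_d, max_weight):
    """Check the product formula (d not a power of 2) and periodicity (d a power of 2)."""
    for d in range(1, max_d + 1):
        if is_power_of_two(d):
            for w in range(max_weight + 1):
                # S_d = 1 exactly when |x| lies in {d, ..., 2d - 1} modulo 2d.
                assert (s_mod2(d, w) == 1) == (d <= w % (2 * d) <= 2 * d - 1)
                # Lower-degree symmetric polynomials are periodic in |x| with period d.
                for lower in range(1, d):
                    assert s_mod2(lower, w) == s_mod2(lower, w + d)
        else:
            d1, d2 = split_degree(d)
            for w in range(max_weight + 1):
                # Lucas' theorem: disjoint binary digits give S_d = S_{d1} S_{d2}.
                assert s_mod2(d, w) == s_mod2(d1, w) * s_mod2(d2, w)
    return True
```

For instance, `split_degree(6)` returns `(2, 4)`, corresponding to $S_6 = S_2 S_4$ on $\mathbb{F}_2^n$.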

Remark. When $\mathbb{F} = \mathbb{F}_2$ and $d$ is a power of two, the above theorem shows that
$(-1)^{S_d}$ does not correlate strongly with any polynomial phase in $\mathbb{F}_2^n$ of order $d - 1$ or
less. However, the argument we used to prove this showed that $S_d$ was still locally
polynomial of degree $d - 1$ on the subvariety $V := \{x \in \mathbb{F}_2^n : S_1(x) = S_2(x) = 0\}$, in
the sense of [12]. This raises the possibility that Conjecture 1.2 may be salvaged by
working with locally polynomial phases instead of global ones; in fact this formulation
of the conjecture was already implicit in [12, Section 13].